Re: [exim] Weird loads with maildir_use_size_file

Author: Gergely Nagy
Date:  
To: exim-users
Subject: Re: [exim] Weird loads with maildir_use_size_file
(If this appears twice, sorry - I sent my first reply using the wrong -
unsubscribed - address, so I'm resending with the address I subscribed
to the list with)

>> The bulk of the issue is, that I have a configuration which Works(tm),
>> it's fast, reliable and works like a charm. However, I would like to use
>> maildirsize files, but whenever I turn maildir_use_size_file on in the
>> appropriate transport, the load goes from the usual 10-20 to 600 and
>> above within half a minute. I believe it would rise even further, but so
>> far, I always turned the option off again before that.
>
> The *usual* 10-20? Wow. That smacks of wedged disks already to me. How
> are your disks laid out?


Well, since there are about 20 deliveries running most of the time 24/7,
plus a few hundred users logged in on imap, 10-20 on a dual quad-core
box isn't THAT high.

With 95% of the deliveries happening within 1 second, I believe the
disks are fine (the other 5% go through extra spam scanning on another
machine; the network traffic and the scanning take a few extra seconds).

As for the disk layout: 6 SAS disks are in the box, in raid 10 (software
raid), and on top of it we have LVM, with 3 vgs, one for the exim spool,
one for the logs and a big chunk for the mail store.

The whole /m is a single volume, the layout was merely set up to ease
the load on the filesystem (ext3, mounted noatime).

We experimented with having the mail store use independent disks for
the various subdirs, but that proved to be problematic (scaling issues
with regard to disk space, a pain in the backside to extend, and so
on), while the gains were quite minor.

>> Without much further ado, the relevant transport in my exim4.conf looks
>> like this:
>>
>> local_delivery:
>> driver = appendfile
>> directory = /m/${l_1:$local_part}/${local_part}@${domain}
>> quota_directory = /m/${l_1:$local_part}/${local_part}@${domain}
>
> These might be useful shortly, combined with the answer you give to my
> earlier question.
>
>> The hardware in question is a dual quad-core Intel Xeon, with 4G ram and
>> a couple of SAS disks appropriately set up to handle the load. It's
>> running on Debian etch, with Exim 4.63 (+ whatever patches Debian applied).
>
> "Appropriately set up" - how? Mirrored in hardware, software? Striped?


See above.

>> The usual load is between 10 and 30 (during the busiest hours), handling
>> 15-25 mails / second with maildir_use_size_file turned off.
>
> Goodness me. That load already smells fairly bad to me - it indicates
> that some processes are either on the CPU permanently or waiting for
> disk IO.


It's mostly IO. But, like I said in the original mail, the system can
handle the amount of traffic, despite the load being ~30. As long as
delivery times and response times are nearly instant, we don't really
care about the load :)

Right now, the load is only 6, I'll post an iostat later when it gets
higher.

> From your tests, and the info provided above, I'd suggest that two disks
> simply is not enough for this platform. I'd hazard a guess that the
> heads are seeking around at a massive rate which causes both reads and
> writes to be delayed, to the point where (at say 10 deliveries/sec) you
> end up with 600 processes waiting for the disks after 60 seconds, thus
> pushing the load through the roof.


It's 6 disks. I used "couple" to mean "a few" - apologies if my English
wasn't entirely clear.

As for the guess: at the time of the 600 load, there were about 40 exim
processes running (courier was shut down for the duration of the test,
so no users were hogging the disks at the time).

I'll try to do a more concrete test sometime today.
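
One way to make such a test more concrete is to look at what the load
figure is made of. On Linux the load average counts tasks that are
runnable (state R) as well as tasks in uninterruptible sleep (state D,
usually waiting on disk IO). The following is only a rough sketch: the
function names are made up for illustration, and it counts processes
rather than individual threads, so its totals will not match the load
average exactly.

```python
import os
from collections import Counter


def parse_state(stat_line):
    """Extract the state letter from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so split on the LAST ')' and take the next field.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states(proc="/proc"):
    """Count processes per state letter (R, S, D, Z, ...)."""
    counts = Counter()
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                counts[parse_state(f.read())] += 1
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # process exited (or is unreadable) mid-scan
    return counts
```

Calling count_states() a few times while the load climbs and comparing
the R and D counts would show whether the 600 is mostly processes stuck
waiting on IO, and roughly how many of them there are.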

> There are some small things you can do:
>
> 1. Mount /m with the "noatime" option - this can be done in normal
> operation with only a tiny interruption:
>    mount -o remount,noatime,nodiratime /m


It has been mounted that way since day 1.

> 2. Ensure your filesystems are using the "dir_index" option (only
> applicable to EXT3). If you need to change it, see the tune2fs man page
> and be prepared for an outage to do the conversion. You can see if it's
> switched on by doing:
>
>    tune2fs -l /dev/sda1 (or whatever the partition is)

>
> Look for the features line, and see if it looks something like this:
> Filesystem features:      has_journal ext_attr resize_inode dir_index
> filetype needs_recovery sparse_super large_file


Also done.

mstore-1:~# grep mstore /proc/mounts
/dev/vg1/mstore /m ext3 rw,noatime,data=ordered 0 0

mstore-1:~# tune2fs -l /dev/vg1/mstore | grep dir_index
Filesystem features:      has_journal resize_inode dir_index filetype
needs_recovery sparse_super large_file


> 3. Add more disks to your array, if you can. How that happens depends on
> your hardware.
>
> It almost certainly looks to me like your disks just aren't quick enough
> to keep up.


It may look so - but then why can courier update the maildirsize files
without trouble, while exim can't do it without making the load go
through the roof?

My first idea was the same, that we need more disks (we do, but for
different and unrelated reasons).

However, the mail queue is empty all the time, deliveries happen within
a second, and if I throw an _additional_ 20 mails / second at the
system, it still delivers them within a few seconds (without the queue
growing beyond a controllable (roughly 100 mails) state).

Also, if the problem were disk performance, then the system would grind
to a halt during busy hours anyway, AND the load would not jump from 10
to 600 within 30 seconds, but climb slowly, in my opinion.

The weird thing is NOT the load, the 30 is fine. The problem is the load
rising to 600 within a very short amount of time when
maildir_use_size_file gets turned on, even though it should NOT generate
that much disk io. It merely does a few directory listings, and we don't
have such huge directories that this should cause significant load (if we
did, then deliveries to that particular directory would be horribly slow
as well, and would make the load rise even without
maildir_use_size_file; in the best case, when the user tries to download
it all via pop3).
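
To illustrate what I mean by "a few directory listings", here is a
minimal sketch of a Maildir++-style usage recalculation. It is not
Exim's actual code; the function names are hypothetical. It assumes
that message sizes are taken from a ",S=<size>" tag in the file name
where one is present, and that stat() is needed only when it is not.
As a simplification, it does not exclude the Trash folder.

```python
import os
import re

# Size tag embedded in the message file name, e.g. "123.host,S=4096:2,S"
_SIZE_TAG = re.compile(r",S=(\d+)")


def folder_usage(folder):
    """Return (bytes, messages) for one folder's cur/ and new/."""
    total_bytes = 0
    total_msgs = 0
    for sub in ("cur", "new"):
        path = os.path.join(folder, sub)
        try:
            names = os.listdir(path)
        except FileNotFoundError:
            continue  # folder has no such subdirectory
        for name in names:
            match = _SIZE_TAG.search(name)
            if match:
                size = int(match.group(1))  # no stat() needed
            else:
                try:
                    size = os.stat(os.path.join(path, name)).st_size
                except FileNotFoundError:
                    continue  # message was moved or deleted meanwhile
            total_bytes += size
            total_msgs += 1
    return total_bytes, total_msgs


def maildir_usage(maildir):
    """Sum the top-level maildir and its ".Folder" subdirectories."""
    total_bytes, total_msgs = folder_usage(maildir)
    for entry in os.listdir(maildir):
        full = os.path.join(maildir, entry)
        if entry.startswith(".") and os.path.isdir(full):
            sub_bytes, sub_msgs = folder_usage(full)
            total_bytes += sub_bytes
            total_msgs += sub_msgs
    return total_bytes, total_msgs
```

If the file names carry no size tag, every message costs one stat()
call on top of the listing, which is the one case where such a
recalculation could become noticeably more expensive.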

I mean, if it generates more disk io than 10 du processes running on the
storage area, and ~30 users downloading their mail via pop3, and me
bombing a mailbox with 5-10 messages a second, then there's a problem,
and it's not with the disks, as those handle the load reasonably well.

It's not with Exim per se, either, as this same box used to be the
primary mail server, handling imap, pop3, incoming and outgoing mail
along with virus & spam filtering, and it had maildir_use_size_file
enabled at that time.

We moved the outgoing mail, the virus & spam filtering to another server
to ease the load, and redid the configuration during the process.

And that's when maildir_use_size_file started to show the effects I see
now - so the problem MUST be in my configuration, I just don't see
where, and am running out of ideas where to look.

--
Gergely Nagy <gergely.nagy@???>