Re: [exim] Weird loads with maildir_use_size_file

Author: Phil Pennock
Date:
To: Gergely Nagy
CC: exim-users
Subject: Re: [exim] Weird loads with maildir_use_size_file
On 2008-06-18 at 13:04 +0200, Gergely Nagy wrote:
> The bulk of the issue is, that I have a configuration which Works(tm),
> it's fast, reliable and works like a charm. However, I would like to use
> maildirsize files, but whenever I turn maildir_use_size_file on in the
> appropriate transport, the load goes from the usual 10-20 to 600 and
> above within half a minute. I believe it would rise even further, but so
> far, I always turned the option off again before that.


Exim doesn't use the maildirsize file itself; it's used by other
programs, and for them it's a cache. For Exim to create it, though, it
needs to readdir() any directory that is either not already present or
whose timestamp differs from the recorded one.

If this is turned on globally, then all at once you're having to
readdir() pretty much every maildir directory on disk, from processes
fighting with each other, including multiple deliveries to the same
user.

I would suspect that you're thrashing memory. Testing against one user
wouldn't show problems, because the problem only appears when all the
deliveries are fighting each other at the same time.

Running "vmstat 1" whilst enabling this would confirm it.

It might be possible to turn this on during a quiet time of day and let
it build; alternatively, you could duplicate local_delivery (the Router
and the Transport it hands off to) so that you have "local_delivery"
followed by "local_delivery_nosize". Set maildir_use_size_file on the
Transport used by the first, and give the first Router a condition rule
restricting it to a subset of the userbase, which you can expand slowly.

You might use a check on letter of the alphabet, or you might use
something like:

    condition = ${if <{${nhash{SIZE_ROLLOUT_MAX}{$local_part@$domain}}}{SIZE_ROLLOUT}}

and set SIZE_ROLLOUT_MAX to, say, 20 and then increment SIZE_ROLLOUT
slowly to roll it out in 5% increments of your userbase.
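
A minimal sketch of how that split might look in the configuration file,
assuming a stock-style setup. The Transport names maildir_size and
maildir_nosize, the $home/Maildir path and the macro values are
placeholders for illustration, not taken from the original
configuration, so adapt them to whatever the existing Router and
Transport already contain:

```
# Main section: macro definitions (illustrative values).
# The longer name is defined first, since Exim may reject a macro whose
# name contains a previously defined macro name as a substring.
SIZE_ROLLOUT_MAX = 20
SIZE_ROLLOUT = 1

begin routers

# Tried first: accepts only users whose hash bucket (0..19) is below
# SIZE_ROLLOUT, i.e. roughly 5% of addresses per increment.
local_delivery:
  driver = accept
  check_local_user
  condition = ${if <{${nhash{SIZE_ROLLOUT_MAX}{$local_part@$domain}}}{SIZE_ROLLOUT}}
  transport = maildir_size

# Fallback: everyone not yet in the rollout keeps the old behaviour.
local_delivery_nosize:
  driver = accept
  check_local_user
  transport = maildir_nosize

begin transports

# Hypothetical Transport with maildirsize support enabled.
maildir_size:
  driver = appendfile
  directory = $home/Maildir
  maildir_format
  maildir_use_size_file

# Hypothetical Transport identical to the above, minus the size file.
maildir_nosize:
  driver = appendfile
  directory = $home/Maildir
  maildir_format
```

Because nhash maps a given address to the same bucket every time, a user
who has been moved onto the size-file Transport stays there as
SIZE_ROLLOUT grows, so each maildir only pays the initial scan once.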

Once you're at SIZE_ROLLOUT == SIZE_ROLLOUT_MAX and comfortable, remove
the second Router and the condition on the first Router.

The other approach is to mess with the kernel tuning parameters to
increase the size or weighting given to the directory cache. I'm not
up-to-date on Linux VM tuning.

Regards,
-Phil