Re: [exim] maildirsize file and massive concurrency

Author: W B Hacker
Date:  
To: exim users
Subject: Re: [exim] maildirsize file and massive concurrency
Heiko Schlittermann wrote:
> Hello,
>
> probably some admin of a larger site using maildir++ and quota
> is reading here.
>
> In order to support quota on a maildir++ storage Exim is configured to
> maintain the "maildirsize" file. There are several rules that trigger a
> recalculation of the used space and a rewrite of this "maildirsize"
> file. On a busy mailbox the rule "filesize > 5120 bytes" applies quite
> often.
>
> The result of the recalculation process is used for quota decision.
> Additionally the result gets written into "maildirsize", BUT ONLY
> IF during the recalculation no changes to any of the relevant
> directories happen.
>
> On a busy system and a full mailbox attached via NFS the recalculation
> takes noticeable time (several seconds), and during this time new mails
> are flooding the mailbox.
>
> Thus often the result of the recalculation gets junked. To make it
> worse, several parallel deliveries then start the recalculation in parallel.
> Again - the results are used, but the "maildirsize" is not created.
>
> Then we get a nice "Stau" (traffic jam) in the queues and a quite impressive load.
>
> I get the feeling, that using "maildirsize" is not an option here. On
> the other hand, the POP/IMAP server (courier) uses it for its quota
> calculation too. Hacking the code (exim and courier) is possible, but
> not preferred.
>
> How is this solved on other busy sites?
>
>
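
For anyone following along without the file in front of them, here is a rough
sketch of what reading "maildirsize" involves, assuming the usual Maildir++
layout: a first line with the quota definition (e.g. "1000000S,100C" for a
byte limit and a message-count limit), followed by lines of "bytes count"
deltas that are summed to get current usage. The function names are made up
for illustration; this is not Exim's or Courier's actual code.

```python
def parse_maildirsize(text):
    """Return (quota, used_bytes, used_count) from maildirsize contents."""
    lines = text.splitlines()
    quota = {"S": None, "C": None}  # S = byte limit, C = message-count limit
    if not lines:
        return quota, 0, 0
    # First line: comma-separated quota elements such as "1000000S,100C".
    for part in lines[0].split(","):
        part = part.strip()
        if len(part) > 1 and part[-1] in quota and part[:-1].isdigit():
            quota[part[-1]] = int(part[:-1])
    used_bytes = used_count = 0
    # Remaining lines: "bytes count" deltas (negative when mail is removed).
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        try:
            size, count = int(fields[0]), int(fields[1])
        except ValueError:
            continue
        used_bytes += size
        used_count += count
    return quota, used_bytes, used_count


def file_too_long(text, limit=5120):
    """The 'filesize > 5120 bytes' rule quoted above: True means recalculate."""
    return len(text.encode("utf-8")) > limit
```

Every delivery appends one delta line, so on a busy mailbox the file crosses
the 5120-byte threshold quickly, which is why that rule fires so often.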


CAVEAT:

- no experience here with quotas. Prefer larger HDD, and/or cron job moving part
of the mailstore off-box to near-line store.


Questions:

- do other POP/IMAP servers do this any differently? (e.g. Dovecot)

- would it make sense to have either the POP/IMAP OR the MTA simply cease
*making* calculations of its own and just query the results of calculations made
by the OTHER daemon? Might be inaccurate, but at least less load and no longer
contentious.

- Is there enough spare 'cushion' capacity that BOTH POP/IMAP and MTA could be
told to NOT *calculate* quota, but rather check a quota computed by a
third-party external process that could be run less frequently, with less
intrusive effect? Once an hour or once a day, even. Again, less accuracy, but
perhaps even lower load, no contention, and should still be 'good enough' to
prevent uber-hogging.

- is it time to split the mailstore over multiple servers?

- is it time to split the user load itself over multiple servers?
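
To make the "computed by an external process" idea a bit more concrete, here
is a minimal sketch of a job that could be run from cron. It assumes a
Maildir++ layout (subfolders are directories whose names start with "."),
takes the size from a ",S=<bytes>" tag in the file name when present so it can
skip a stat() over NFS, and publishes the result atomically. The cache file
name "quota-usage" and all function names are hypothetical; neither Exim nor
Courier reads such a file without extra configuration or glue.

```python
import os
import re
import tempfile

SIZE_IN_NAME = re.compile(r",S=(\d+)")  # Maildir++ size tag in the file name


def message_size(path, name):
    """Size from the file name if tagged, else fall back to stat()."""
    match = SIZE_IN_NAME.search(name)
    if match:
        return int(match.group(1))
    return os.stat(os.path.join(path, name)).st_size


def scan_maildir(root):
    """Return (total_bytes, total_count) over new/ and cur/ of all folders."""
    folders = [root] + [
        os.path.join(root, d)
        for d in os.listdir(root)
        if d.startswith(".") and os.path.isdir(os.path.join(root, d))
    ]
    total_bytes = total_count = 0
    for folder in folders:
        for sub in ("new", "cur"):
            path = os.path.join(folder, sub)
            if not os.path.isdir(path):
                continue
            for name in os.listdir(path):
                try:
                    total_bytes += message_size(path, name)
                except FileNotFoundError:
                    continue  # message moved or deleted during the scan
                total_count += 1
    return total_bytes, total_count


def write_usage(root, cache_name="quota-usage"):
    """Write 'bytes count' to a cache file; readers never see a partial file."""
    total_bytes, total_count = scan_maildir(root)
    fd, tmp = tempfile.mkstemp(dir=root, prefix=cache_name + ".")
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{total_bytes} {total_count}\n")
    os.replace(tmp, os.path.join(root, cache_name))  # atomic rename
    return total_bytes, total_count
```

Because only this one job ever writes the result, there is nothing for
parallel deliveries to fight over; the price is that the figure can be up to
one cron interval stale.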

We use per-user unique mailstore location and type (Maildir or otherwise),
synced between/among MTA, IMAP, Webmail. SQL here, but it doesn't have to be.

Hope there is at least an idea or two in there for you...

Bill