Re: [exim] maildirsize file and massive concurrency

Author: Heiko Schlittermann
Date:  
To: exim-users
Subject: Re: [exim] maildirsize file and massive concurrency
Hi,

thank you for responding…

W B Hacker <wbh@???> (Thu Feb 17 03:48:30 2011):
> - no experience here with quotas. Prefer larger HDD, and/or cron job moving part
> of the mailstore off-box to near-line store.


A larger HDD is not an option here. We're talking about 120e3 mailboxes,
and the data is growing by about 60…70G per day.

A cron job is not an option either: scanning the whole storage takes ages.

Moving older messages somewhere else -- what is "older"? Date of
delivery, last access, ??? As an IMAP user I'd expect a quota on size and
message count, but not on the age of a message.

> Questions:
> - do other POP/IMAP do this any differently? (eg: Dovecot)


Since POP/IMAP access is orders of magnitude less frequent than delivery,
this side of the problem does not matter.

We just need a common denominator where both sides (exim | courier) find
information about the quota and the used size. The "maildirsize" file is
such a common denominator, but it does not seem suitable for
massively parallel delivery.
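
To make the shared format concrete, here is a minimal sketch of how either
side could read the file. It assumes the usual Maildir++ layout as I
understand it: the first line holds the quota definition (for example
"10000000S,1000C"), and every further line holds one "bytes count" pair
that is added to the running total. The function name parse_maildirsize
and the sample content are made up for illustration; this is neither
Exim's nor Courier's actual code.

```python
from typing import Dict, Tuple


def parse_maildirsize(text: str) -> Tuple[Dict[str, int], int, int]:
    """Return (limits, used_bytes, used_count) from maildirsize content.

    Assumed layout: line 1 is the quota definition, e.g. "10000000S,1000C";
    each later line is "<bytes> <count>" and may be negative (deletions).
    """
    lines = text.splitlines()
    if not lines:
        raise ValueError("empty maildirsize content")

    # Quota definition: comma-separated numbers with a one-letter suffix.
    limits: Dict[str, int] = {}
    for item in lines[0].split(","):
        item = item.strip()
        if item:
            limits[item[-1]] = int(item[:-1])

    # Usage is the sum of all following "bytes count" lines.
    used_bytes = 0
    used_count = 0
    for line in lines[1:]:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines in this sketch
        used_bytes += int(fields[0])
        used_count += int(fields[1])
    return limits, used_bytes, used_count


# Hypothetical file content: two deliveries and one removal.
sample = (
    "10000000S,1000C\n"
    "        4096    2\n"
    "        1500    1\n"
    "       -2048   -1\n"
)
limits, used_bytes, used_count = parse_maildirsize(sample)
print(limits, used_bytes, used_count)
```

Reading the file is cheap. The expensive part is the recalculation, when
the whole mailbox has to be scanned to rebuild these totals.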

> - would it make sense to have either the POP/IMAP OR the MTA simply cease
> *making* calculations of its own and just query the results of calculations made
> by the OTHER daemon? Might be inaccurate, but at least less load and no longer
> contentious.


Yes, we're considering some "meta data daemon", but this would require
patching at least the courier side. It's not impossible, but we'd like
to stay with vanilla software.

> - is it time to split the mailstore over multiple servers?


The mailstore isn't the bottleneck: it's fast. Solaris clients
need about 0.4 seconds to readdir() all entries from our biggest
mailbox, but the Exims are running on Linux, and the Linux NFS client
needs about 4 seconds for such a scan. Thus I'd say it's not the fault of
the mail storage.
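
For anyone who wants to compare clients in the same way, a rough sketch of
such a measurement follows. It only times the enumeration of directory
entries via os.scandir() and does not stat each file, so it is not
identical to what a quota recalculation does. The helper name time_scan is
hypothetical, and the demo runs against a temporary directory rather than
a real mailbox.

```python
import os
import tempfile
import time
from typing import Tuple


def time_scan(path: str) -> Tuple[int, float]:
    """Count the entries in 'path' and return (count, elapsed_seconds)."""
    start = time.perf_counter()
    with os.scandir(path) as entries:
        count = sum(1 for _ in entries)
    return count, time.perf_counter() - start


# Demo on a throwaway directory; point time_scan() at a maildir's
# "cur" directory on the NFS mount to get a comparable number.
with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(3):
        with open(os.path.join(demo_dir, f"msg{i}"), "w") as fh:
            fh.write("x")
    demo_count, demo_elapsed = time_scan(demo_dir)
    print(f"{demo_count} entries in {demo_elapsed:.6f}s")
```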

> - is it time to split the user load itself over multiple servers?


It's done already and leads us to the above-mentioned problems: multiple
parallel deliveries into the same mailbox, causing parallel
recalculations.

--
Heiko :: dresden : linux : SCHLITTERMANN.de
GPG Key 48D0359B : 3061 CFBF 2D88 F034 E8D2 7E92 EE4E AC98 48D0 359B