Author: W B Hacker Date: To: exim users Subject: Re: [exim] maildirsize file and massive concurrency
Todd Lyons wrote:
> On Sun, Feb 20, 2011 at 2:08 AM, Phil Pennock <exim-users@???> wrote:
>> One approach might be to bias which hosts receive mail for a given
>> mailbox; you said you had a two-stage system (good); so the front stage
>> uses ${nhash_4:$local_part@$domain} to get a number 0-3; you use this to
>> select which of the backing hosts is used, but set "fallback_hosts" to
>> the complete list of backing hosts, so that in the event of a machine
>> failure you still get delivery. Thus normally all mails for one address
>> go to just one host and you maintain NFS cache coherency.
>
> Ok, now single stage systems. I was going to make a router/transport
> that would use this hash selection method and do lmtp direct to
> exim. Seemed like a good idea, but then I quickly realized that exim
> can speak lmtp when sending, but it does not do lmtp inbound, which stopped
> me dead in the water. The only other option that I can see is to make
> a hostlist of my exim mail servers, receive the email, but instead of
> delivering locally, do the hash function to select one of them and
> forward it to that one (doing the local delivery if it happens to be
> itself). The only issue I can see with that, is that the quota isn't
> checked until the second instance (if it's not the local machine)
> tries to do the delivery. The resulting backscatter is what I'm
> trying to avoid in the first place by having a single stage system.
>
> Any suggestions or comments?
>
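For anyone skimming the thread, the hash-then-fallback selection discussed above can be sketched roughly as follows. This is only an illustration under assumptions: zlib.crc32 stands in for Exim's nhash (which is a different function), and the host names and helper names are hypothetical, not anything from Phil's or Todd's setups.

```python
import zlib

# Hypothetical backing hosts; not taken from the thread.
BACKENDS = [
    "mbox0.example.net",
    "mbox1.example.net",
    "mbox2.example.net",
    "mbox3.example.net",
]


def delivery_order(local_part, domain, backends=BACKENDS):
    """Return the hosts to try: hashed primary first, then the rest as fallback."""
    address = f"{local_part}@{domain}".lower()
    # crc32 is a stand-in for Exim's nhash; any stable hash has the same effect
    # of sending all mail for one address to one host in normal operation.
    index = zlib.crc32(address.encode("utf-8")) % len(backends)
    primary = backends[index]
    return [primary] + [host for host in backends if host != primary]


def next_hop(local_part, domain, this_host, backends=BACKENDS):
    """Single-stage variant: deliver locally if this host is the hashed primary."""
    primary = delivery_order(local_part, domain, backends)[0]
    return "local" if primary == this_host else primary
```

Note that in the single-stage variant the quota is still only checked on whichever host finally does the delivery, which is the backscatter concern raised above.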
Go for BFBI (brute force and bloody ignorance).
No need for a 'protocol-laden' delivery for single-staging.
- Start with 'enough' resources to support 'many' exim child process runs
(quad or hex core CPU, much RAM) on any given box [1].
- Give each box a multi-channel HBA (SATA is good enough).
- Have the routers/transports deliver the messages across many I/O channels to separate
HDD [arrays]. Specifically, not all local recipients are on the same
mount-points (and never, ever on '/var/' or the same HDD as the system, BTW [2]).
BUT: ALL mounts are 'local', not NAS, SAN, or NFS. So long as there is no
protocol more complex or latency-challenged than ATA or SCSI involved, the OS
handles such throughput quite well.
So NOT hashed. Lookups. Storage has to be in predictable locations so the same
lookup by POP/IMAP finds them. Ours use part of each user's 'normal' primary
server ID as part of the dirtree so that the same SQL return finds the same
(mirrored) files even if attached to the failover/backup server. And the reverse...
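A minimal sketch of the lookup idea, assuming a made-up schema: one SQL row per mailbox gives the user's 'normal' primary server ID and a volume, and both the MTA and the POP/IMAP side build the path from that same row. The table, column names, and '/mail' root are hypothetical, and sqlite3 is used only so the example is self-contained.

```python
import sqlite3
from pathlib import PurePosixPath


def open_demo_db():
    """Build a throwaway in-memory table standing in for the real user database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE mailbox ("
        "address TEXT PRIMARY KEY, home_server TEXT, volume TEXT)"
    )
    db.executemany(
        "INSERT INTO mailbox VALUES (?, ?, ?)",
        [
            ("alice@example.org", "mx1", "vol03"),
            ("bob@example.org", "mx2", "vol01"),
        ],
    )
    return db


def maildir_for(db, address, mail_root="/mail"):
    """Return the mailbox path for an address, or None if the user is unknown.

    The primary server ID is part of the directory tree, so the same lookup
    yields the same path whether the (mirrored) volume is attached to the
    primary box or to its failover sibling.
    """
    address = address.lower()
    row = db.execute(
        "SELECT home_server, volume FROM mailbox WHERE address = ?", (address,)
    ).fetchone()
    if row is None:
        return None
    home_server, volume = row
    return PurePosixPath(mail_root) / home_server / volume / address / "Maildir"
```

With the demo rows, maildir_for(db, "alice@example.org") gives /mail/mx1/vol03/alice@example.org/Maildir no matter which box runs the query.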
...and it is STILL wise to:
- limit how many and/or what percentage of users go on a single [point of
failure] box,
- reserve pre-configured space on siblings to handle temporary failover coverage
for each other,
- engineer and test failover/recovery to be as painless as you can make it,
such as in-place copies of the same certs for the IPs they would assume, so SMTP
submission or POP/IMAP clients don't hiccup...
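The 'reserve space on siblings' point boils down to simple arithmetic that is worth checking before a box dies rather than after. A rough sketch, assuming a hypothetical inventory of used and total gigabytes per box (the data shape and function name are illustrative only):

```python
def failover_shortfalls(boxes):
    """Check whether the siblings of each box could absorb its mail if it failed.

    boxes maps a box name to {"used_gb": int, "capacity_gb": int}
    (a hypothetical inventory format). Returns {box_name: shortfall_gb}
    for every box whose data would not fit in its siblings' spare space.
    """
    shortfalls = {}
    for name, box in boxes.items():
        # Spare space available on all the *other* boxes.
        spare = sum(
            other["capacity_gb"] - other["used_gb"]
            for other_name, other in boxes.items()
            if other_name != name
        )
        if box["used_gb"] > spare:
            shortfalls[name] = box["used_gb"] - spare
    return shortfalls
```

An empty result means any single box can fail and its users still fit elsewhere; it says nothing about I/O headroom, which needs its own testing.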
S**t does break. Peer MTAs will retry, but MUAs are *way* less forgiving,
so it is the POP/IMAP absence that will be noticed first and generate the
complaint calls first.
Design that part first, make it rock-solid, then configure the MTA to fit, and
get to enjoy the REST of life.
;-)
Bill
[1] Alternatives: the VIA CPU family. Not much to write home about
performance-wise, and overtaken by AMD's & Intel's latest on thermal/power
efficiency, BUT... in the SMTP, POP/IMAP/HTTPS webmail world, the hardware
encryption engine lets these lowly CPUs punch as much as 20 times above their
nominal weight, since near-as-dammit *everything* wants SSL/TLS, and all day long...
[2] Separate HDD even if NOT RAID. The *sole head-positioner* on a HDD should
not have to do configuration reads from ~/etc, log writes to /var, queue writes,
then reads, mailstore writes, then POP/IMAP reads, yadda, yadda....
Get at least two, preferably more, head-positioners AND spindles (RAID 1 is cheap
and cheerful) into the mix and life smooths out dramatically. Decent drives
(ABS) last as much as double their warranty as well.