Author: Erwin Andreasen
Date:
To: exim-users
Subject: [exim] Dealing with remote side throttling connections and large mail queues

I'm sending a fair amount of email per day, most of it in large batches, which
causes rather large mail queues to pile up (100-200k messages). I use the split
spool directory and start several external queue runners to speed up delivery.
That works relatively well (10-20 mails delivered per second) until I run into
delivery to providers like Yahoo, which have some kind of rate limiting. After a
few thousand messages Yahoo responds with 4XX for quite a while.

That seems to cause Exim trouble: it looks like many of my queue runners now
spend their time constantly rescanning the spool and repeatedly trying to
deliver those messages to all the different MX hosts that Yahoo has.
Or are my multiple queue runners to blame here, all trying to deliver the same
message? I see something like this: the same message ID is tried against 3
different MX hosts in succession.

My strategy for starting those external queue runners is basically to run an
exim -R for the largest hosts with outstanding mail in the queue (assuming that
they can accept mail quickly, so the queue can clear) plus some general -q
runs (for all other hosts), if system load is not too high.
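
To make that strategy concrete, here is a minimal Python sketch of the selection
step only. Everything in it is illustrative: the function name
plan_queue_runners, the load threshold, the number of targeted domains and the
example counts are assumptions, not what my real script uses. It takes
per-domain queue counts as input (how they are gathered is left out) and treats
the load check as gating all runs, which is one reading of the description above.

```python
from typing import Dict, List


def plan_queue_runners(queued_by_domain: Dict[str, int],
                       load_avg: float,
                       max_load: float = 8.0,   # assumed threshold
                       top_n: int = 3,          # assumed number of targeted domains
                       general_runs: int = 2    # assumed number of plain -q runs
                       ) -> List[List[str]]:
    """Return the exim command lines to start, or [] if load is too high."""
    if load_avg > max_load:
        return []
    # Largest backlog first; ties broken by name so the order is deterministic.
    busiest = sorted(queued_by_domain.items(),
                     key=lambda kv: (-kv[1], kv[0]))[:top_n]
    # One targeted run per busy domain: exim -R <string> selects by recipient.
    commands = [["exim", "-R", domain] for domain, count in busiest if count > 0]
    # General runs that pick up mail for all other hosts.
    commands += [["exim", "-q"] for _ in range(general_runs)]
    return commands


if __name__ == "__main__":
    # Hypothetical queue counts, for illustration only.
    example = {"yahoo.com": 40000, "example.net": 1200, "example.org": 9000}
    for cmd in plan_queue_runners(example, load_avg=1.5, top_n=2):
        print(" ".join(cmd))
```

The commands are returned as argument lists rather than executed, so the plan
can be inspected first; a caller could pass os.getloadavg()[0] as load_avg and
hand each list to subprocess.Popen.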

Can I somehow limit the retries on different MX hosts? Use only one (random?)
MX host before giving up and waiting for the retry time? I saw a mail in the
list archives which gave a -R option that would deliver anything *but*
recipients matching yahoo.com, so maybe that's the way to go for me.

Alternatively, I'm thinking of using several separate Exim instances, maybe one
dedicated to yahoo.com that I can pause on defer problems. Perhaps, instead of
e.g. 50 runners all scanning through a 100,000-message queue, it would be better
to have 5 sets of 10 runners, each set having its own 20,000-message queue.
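
As a rough back-of-the-envelope comparison of the two layouts, assuming (and
this is only an assumption) that every runner walks its entire queue once per
pass, the number of spool entries examined can be worked out as follows. The
helper name scans_per_pass is made up for this sketch.

```python
def scans_per_pass(runners: int, queue_size: int) -> int:
    """Spool entries examined if each runner walks its whole queue once."""
    return runners * queue_size


# One shared queue: 50 runners over 100,000 messages.
single = scans_per_pass(50, 100_000)

# Five independent instances: 10 runners over 20,000 messages each.
split = 5 * scans_per_pass(10, 20_000)

print(single, split, single // split)
```

Under that assumption the split layout examines 1,000,000 entries per pass
instead of 5,000,000, a factor of 5. It says nothing about time spent on the
deliveries themselves.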

I'll note that I also sign outgoing email with DomainKeys, and as that signature
is not stored in the spool file, I guess Exim must re-sign the message every
time it attempts a delivery, which probably doesn't help with CPU usage.