Author: Charlie Elgholm Date: To: exim-dev Subject: [exim-dev] Feature request: Option to treat particular
message-specific retries as being host-specific
Background
A lot of SMTP-servers have started throttling / rate-limiting
incoming emails, usually based on the sender's (SMTP) IP-address. When
processing a large queue, where many emails are to the same domain,
their rate-limiting gets triggered, but Exim keeps trying each and
every message in the queue for that domain because the retry-rule is
message-specific: Exim needs to see a 4xx-reply for each and
every message before that message is placed in "retry"-mode.
Example
When generating unique reports / mailings to our customers, we might
end up with 2500 emails going to domain X, out of perhaps 50000
messages. Exim starts processing the queue, we start many
queue-runners, and after 100 emails to domain X we start getting a
"4xx-please back off"-message, but Exim continues to try the remaining
2400 messages anyway. After a while, the retry-rule says try again, and
we either a) get a "4xx-please back off" immediately or b) complete
another 100 emails, but Exim still tries the remaining 2300 emails, and
so on. One might think that the remote SMTP-server probably extends the
"waiting time" for our server, since we're acting ignorant ;)
Idea
Have an option, perhaps a regex-filter, so that some replies can
tell the Exim queue-runners to immediately place the whole host (and
with it all messages queued for that host) in the "retry"-state (or
other states).
Problem
If there are many queue-runners, which there are, they all need to be
made aware of the new status for the host in question, so that they
all skip those messages during their queue-run.
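To illustrate the coordination problem only: a minimal sketch of a shared "deferred domains" store that several independent processes could consult before attempting a delivery. It uses an SQLite file purely as a stand-in for whatever shared state Exim would actually use; the function names (open_store, defer_domain, should_skip) and the table layout are hypothetical and not part of Exim.

```python
import sqlite3
import time


def open_store(path):
    """Open (and if needed create) the shared store. Hypothetical helper."""
    conn = sqlite3.connect(path, timeout=5.0)  # wait up to 5s on a locked file
    conn.execute(
        "CREATE TABLE IF NOT EXISTS deferred ("
        "domain TEXT PRIMARY KEY, retry_at REAL NOT NULL)"
    )
    conn.commit()
    return conn


def defer_domain(conn, domain, seconds, now=None):
    """Record that nothing should be sent to `domain` for `seconds`."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO deferred (domain, retry_at) VALUES (?, ?)",
        (domain.lower(), now + seconds),
    )
    conn.commit()


def should_skip(conn, domain, now=None):
    """Return True if another runner has deferred `domain` and the
    retry time has not been reached yet."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT retry_at FROM deferred WHERE domain = ?", (domain.lower(),)
    ).fetchone()
    return row is not None and row[0] > now
```

In this sketch, the runner that first sees the "back off" reply would call defer_domain(), and every other runner would call should_skip() before trying a message for that domain.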
Examples of replies that could be filtered:
- "421 .* Unfortunately, some messages from <ip> weren't sent. Please
try again. We have limits for how many messages can be sent per hour
and per day."
- "421 .* Messages from <ip> temporarily deferred"
Please note that both examples are 421-messages. This doesn't need
to be the case, though: we've seen others in our logs, I just
can't find them right now.
Existing solutions
We've seen solutions based on different configurations that
"identify" the message when it's being received by Exim, and then
either queue it in different named queues*, push it to special
throttled transports or forward it to another smarthost (e.g. Postfix)
that implements the rate-limiting in its queue.
(*The named queues are then run outside of Exim's normal
queue-running, with different rate-limiting techniques, cron/sleep.)
Problem with existing solutions
They all rely on us knowing in advance which
domains/IPs/MXs need the special handling, and they also
somewhat mess up an otherwise clean and intuitive configuration/setup.
If we implement some kind of new option/filtering, we can have:
  /We have limits for how many messages can be sent per hour and per day/ => retry-all-in-queue @$domain / 1h
  /Messages from .* temporarily deferred/ => retry-all-in-queue @$domain / 1h
  /4\d\d/ => retry-all-in-queue @$domain
The last one being a "catch all".
You can of course also do other things, not just "retry-all-in-queue",
and there's perhaps no need to specify the "1h", since we already have
the ability for that in the retry-rules.
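A rough sketch of how such a rule list might be evaluated against an SMTP reply, first match wins. This is illustration only, not Exim configuration syntax or Exim code: RULES and match_reply are hypothetical names, the "1h" is written as 3600 seconds, None stands for "use the normal retry-rules", and the catch-all is anchored to the start of the reply here (an assumption, the proposed rule above is unanchored).

```python
import re

# (pattern, action, delay in seconds or None) -- hypothetical rule table
RULES = [
    (
        # \s+ tolerates a line break inside a multi-line reply
        re.compile(
            r"We have limits for how many messages can be sent "
            r"per hour and per\s+day"
        ),
        "retry-all-in-queue",
        3600,
    ),
    (
        re.compile(r"Messages from .* temporarily deferred"),
        "retry-all-in-queue",
        3600,
    ),
    # Catch-all for any 4xx reply; delay left to the existing retry-rules
    (re.compile(r"^4\d\d"), "retry-all-in-queue", None),
]


def match_reply(reply):
    """Return (action, delay) for the first rule matching `reply`,
    or None if no rule matches."""
    for pattern, action, delay in RULES:
        if pattern.search(reply):
            return action, delay
    return None
```

For example, match_reply("421 4.7.0 Messages from 192.0.2.1 temporarily deferred") would select the second rule, while a "250 OK" reply matches nothing.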
As Jan Ingvoldstad pointed out on exim-users, it's really the MX-host
top-domain that we should rate-limit, but a good start would be to
just use the recipient's domain for now, since I guess that would be
much easier to implement. =)
Benefit
Exim would immediately start playing nice with those SMTP-servers that
tell it to back off, and we don't have to identify them all in
advance, though if we're so inclined we can still set up better
retry-rules for those domains.
If you think this is not needed for Exim, or you think that I should
have posted this somewhere else, or perhaps used better language,
well, I'm sorry. I'm Swedish. We're good at that, being sorry... =)