[Exim] exim handling of 5xx fatal errors

Author: Ian Garrison
Date:  
To: exim-users
CC: bitstream
Subject: [Exim] exim handling of 5xx fatal errors
This has probably been discussed before, but I still have to ask, as
googling and digging around on exim.org haven't turned up much.

Exim (we run 3.35) handles 5xx failures a little differently than most
MTAs. When a delivery gets a permanent failure, most MTAs simply stop
processing that one message for the rest of the queue run. Exim goes
further and does not send ANY mail to the mail server that returned the
failure for the remainder of that queue run.

Consider this scenario:

One has 10k emails spooled up to send to an external mailserver, 5k of
them legit and 5k spam, and the external mailserver does VRFY-like user
checking (returning 550 'user not found') because of a spam firewall
they run. As the queue is flushed, the legit emails at the top of the
queue go through just fine, but as soon as a spam delivery is attempted
and rejected, exim defers all mail for that mailserver for the remainder
of the queue run. This leaves legit email stuck in the queue longer than
it should be. On the next queue run a few more legit emails are sent,
but then a spam or two causes the process to repeat. Eventually some
portion of the legit email has been in the queue long enough that it is
wiped out (we have 5 day queue retention in our exim config). I'm
curious whether there is an elegant configuration change to dodge the
issue above, but even more curious as to why exim handles this
differently than most MTAs.
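
To make the scenario concrete, here is a toy Python sketch of the
behaviour as I have described it. It is not exim's code. The function
name and the queue model are made up for illustration, and it assumes
that a message rejected with 550 is bounced and leaves the queue while
everything behind it for the same host waits for the next queue run.

```python
def simulate_queue_runs(queue, max_runs=100):
    """Toy model of the behaviour described above (NOT exim's real logic).

    queue: list of "legit" / "spam" strings in queue order, all for one host.
    Returns (delivered_per_run, messages_still_queued).
    """
    delivered_per_run = []
    pending = list(queue)
    while pending and len(delivered_per_run) < max_runs:
        delivered = 0
        still_queued = []
        host_deferred = False
        for kind in pending:
            if host_deferred:
                # Host already returned a 5xx in this run: skip until next run.
                still_queued.append(kind)
            elif kind == "legit":
                delivered += 1
            else:
                # Assumption: the spam gets a 550, is bounced and removed,
                # and the host is deferred for the rest of this run.
                host_deferred = True
        delivered_per_run.append(delivered)
        pending = still_queued
    return delivered_per_run, pending


# Small example: 4 legit and 3 spam messages interleaved in one queue.
runs, left = simulate_queue_runs(
    ["legit", "legit", "spam", "legit", "spam", "spam", "legit"]
)
print(runs, left)
```

Under these assumptions every rejected spam costs one whole queue run,
so a legit message waits for as many runs as there are spam messages
ahead of it in the queue.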

Anyway, I am having a discussion with one anti-spam appliance vendor
and pointed out the scenario above. Their reply was "exim does not
comply to the rfc's in this regard". They muttered something about RFC
822 and 2822, neither of which spells out the answer I'm looking for.
Can anybody comment on the RFC-correctness of 'stop sending all mail
destined to one mail server that returned a critical error for the
duration of one queue run'? Exim does do something different from the
masses, and I cannot find the behavior for such a case documented in any
RFC (I suspect it's an implementation detail). Does exim maintain this
behavior through all versions?
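
For reference, reply codes are defined in the SMTP RFCs (821 and 2821)
rather than in 822/2822, which cover the message format. The first digit
gives the class of the reply. Below is a minimal Python sketch of that
classification. The function name is hypothetical, and as far as I can
tell nothing in those classes says how a sender must schedule its other
queued messages for the same host.

```python
def reply_class(code):
    """Map an SMTP reply code to its class by first digit (RFC 2821, 4.2.1)."""
    classes = {
        1: "positive preliminary",
        2: "positive completion",
        3: "positive intermediate",
        4: "transient negative completion",  # try again later
        5: "permanent negative completion",  # do not retry as-is
    }
    return classes.get(code // 100, "unknown")


print(reply_class(550))
print(reply_class(451))
```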

It is my opinion that in situations like the one above it's best if
everybody always accepts the email. They can /dev/null or reroute it
afterwards, but terminating an SMTP session with an error code whose
meaning depends on the MTA implementation seems screwy (550 might be
'user not found' for some or 'server not available' for others). This
also rids folks of the Rumpelstiltskin VRFY-type attacks (it amazes me
that this issue has resurfaced; it was dealt with in the past with 'turn
off VRFY', but has returned in antispam products). Always accepting mail
seems more compatible with the majority of MTAs, especially exim, and it
also keeps the sender's mail queues free of spam/garbage that is doomed
to permanent failure for its lifespan in the queue. Does anybody see a
serious flaw in my thinking on this point? While I hold these opinions,
it would appear that the industry doesn't think the same way, and I
suspect that I'm missing something. I am not getting an enlightened
response from the spam filtering vendor aside from 'exim is not rfc
compliant in this regard, use something else'.

Please include my email address in any replies, as I am not subscribed
to exim-users. Hopefully I am worthy of some feedback from those more
experienced with exim and MTA correctness.

Thanks,
-ian