[I am copying this to exim-users because that's where the original
message was posted. However, I'm also copying it to exim-dev, and
setting reply-to to exim-dev, because it's really a development issue.
If there is any discussion, it should be on exim-dev.]
On Tue, 18 Jan 2005, Harald Schueler wrote:
> Or maybe I'm stretching Exim's configuration beyond what can be expected to
> work? Or have I made a stupid mistake?
I have now managed to analyze this to the point where I understand what
is going on. I have also managed to make a slightly simpler
configuration that exhibits the bad behaviour.
The Bad News is that I don't think anything can be done about it without
a complete re-design of the way Exim handles duplicate addresses. At
present, it tries to eliminate them during routing, but I suspect that,
with the features that are now available in Exim, the only correct way
would be to do the elimination after all the routing has been done. This
is clearly a major re-design and re-writing exercise.
Let me try to explain the fundamental problem (= bug = misdesign).
Exim operates in routing cycles: in each cycle it passes each address
through the routers, accumulating any newly-created redirected
addresses. These are then processed afresh in the next cycle.
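To make the cycle idea concrete, here is a minimal Python sketch. It is a toy model, not Exim's code: the function name route_in_cycles and the dictionary-based "router" are assumptions made purely for illustration, and no duplicate elimination is attempted. The sample data mirrors the alice/bob example used in this message.

```python
def route_in_cycles(recipients, redirects):
    """Toy model of routing cycles (hypothetical, not Exim's implementation).

    `redirects` maps an address to the list of addresses it is redirected to.
    An address with no entry is treated as deliverable. The data must not
    contain redirection loops, because this sketch does no loop detection.
    """
    cycles = [list(recipients)]   # addresses present at the start of each cycle
    deliveries = []               # addresses that needed no further routing
    pending = list(recipients)
    while pending:
        next_cycle = []
        for addr in pending:
            children = redirects.get(addr)
            if children is None:
                deliveries.append(addr)
            else:
                # Newly created addresses are held over for the next cycle.
                next_cycle.extend(children)
        if next_cycle:
            cycles.append(next_cycle)
        pending = next_cycle
    return cycles, deliveries


cycles, deliveries = route_in_cycles(
    ["alice", "bob"],
    {
        "alice": ["alice-1"],
        "bob": ["bob-1"],
        "alice-1": ["alice-2"],
        "bob-1": ["alice-2"],
    },
)
for number, addresses in enumerate(cycles):
    print(number, addresses)
```

With no duplicate handling at all, alice-2 simply turns up twice in the last cycle, which is the situation the duplicate-detection logic is meant to deal with.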
Suppose a message is addressed to both alice and bob, and after two
routing cycles, these addresses have been aliased like this:
    alice        bob        <== start
      |           |
      V           V
   alice-1      bob-1       <== after first cycle
      |           |
      V           V
   alice-2     alice-2      <== after second cycle
Now, at this point Exim notices that it has two identical addresses, so
it drops one as a duplicate. When I first implemented this logic, I
thought "routing an address will always give the same answer, so it
does not matter which address is discarded". Unfortunately, my premise
was utterly wrong.
Suppose that routing alice-2 causes it to be redirected to alice-1 (this
is in effect what was happening in Harald's situation). If this is the
alice-2 that came from alice, Exim will treat alice-1 as a "redirection
to ancestor" because it will find alice-1 further up the ancestor chain.
That case works: you are allowed to forward to yourself. The router that
processed the original alice-1 will be skipped when the second alice-1
is routed.
However, if it is bob's alice-2 that is retained, once it is redirected
to alice-1, Exim discards it as a duplicate (because alice-1 has been
handled in a different ancestor chain). Thus, in this case, no delivery
at all occurs because all the addresses have been discarded.
This is of course a total disaster.
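The two outcomes can be reproduced with a small Python model. This is only a sketch of the behaviour described above, under simplifying assumptions: the Addr class, the simulate function and the "seen" set are hypothetical names, and the handling of "redirection to ancestor" is reduced to "the redirecting router is skipped, so the address is delivered".

```python
class Addr:
    """An address plus a link to the address it was generated from."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def ancestors(self):
        names = []
        node = self.parent
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names


def simulate(keep):
    """Return the deliveries when the alice-2 from `keep` is the survivor.

    `keep` is "alice" or "bob": whose copy of alice-2 is retained when the
    other copy is dropped as a duplicate.
    """
    alice, bob = Addr("alice"), Addr("bob")
    alice_1, bob_1 = Addr("alice-1", alice), Addr("bob-1", bob)
    copies = {
        "alice": Addr("alice-2", alice_1),
        "bob": Addr("alice-2", bob_1),
    }
    # Every address handled so far, regardless of which chain it was in.
    seen = {"alice", "bob", "alice-1", "bob-1", "alice-2"}

    survivor = copies[keep]
    # Routing alice-2 redirects it to alice-1.
    child = Addr("alice-1", survivor)

    if child.name in child.ancestors():
        # Redirection to ancestor: allowed. The router that produced the
        # redirection is skipped, so in this toy model it is delivered.
        return [child.name]
    if child.name in seen:
        # Handled in a different ancestor chain: discarded as a duplicate.
        return []
    return [child.name]


print("alice's copy kept:", simulate("alice"))
print("bob's copy kept:  ", simulate("bob"))
```

In this model, keeping alice's copy yields one delivery to alice-1, while keeping bob's copy yields no deliveries at all, so the result depends entirely on which of two "identical" addresses happened to be discarded.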
It occurs to me also that the existence of the redirect_router option
could make this worse. Each alice-2 could have a redirect_router setting
that caused it to be handled entirely differently, for example, one
might redirect to adam and the other to eve.
My reaction to this discovery is .... Aarrgghh!!
I'm afraid nothing can be done about this in the short term, but I guess
there should be a long-term plan to redesign this part of Exim from
scratch. In fact, the duplicate detection logic has become messy over
time and has caused a number of problems. See, for example, the
following ChangeLog entries:
4.11/17 4.11/85 4.05/31 4.02/17 3.953/55 3.21/7 ...
So it's clearly a rather fragile area.
Sorry, folks.
Philip
--
Philip Hazel University of Cambridge Computing Service,
ph10@??? Cambridge, England. Phone: +44 1223 334714.