I'm posting this not so much because I have a solution, but because the
problem reminded of a compiler design course I took many, many years ago,
and I thought I'd throw this out for discussion.
On Tue, Mar 08, 2005 at 04:55:17PM +0000, Philip Hazel wrote:
> Suppose a message is addressed to both alice and bob, and after two
> routing cycles, these addresses have been aliased like this:
>
> alice bob <== start
> | |
> V V
> alice-1 bob-1 <== after first cycle
> | |
> V V
> alice-2 alice-2 <== after second cycle
This problem seems to be similar to that which a compiler encounters
when trying to optimize code with common subexpressions.
Consider: a=(b*c)+(b*c)
A compiler constructs a Directed Acyclic Graph merging subtrees when
it sees common subexpressions:
=
/ \
a +
/ \
| |
\ /
*
/ \
b c
> Now, at this point Exim notices that it has two identical addresses, so
> it drops one as a duplicate. When I first implemented this logic, I
> thought "routing an address will always give the same answer, so it
> does not matter which address is discarded". Unfortunately, my premise
> was utterly wrong.
Instead of throwing away one of the duplicate addresses (and the
associated ancestor tree), could the two trees be merged into a DAG?
That way all common ancestors can be accounted for.
Would a similar data structure and algorithm be applicable here?
Does that save all the information we need?
More importantly, is email routing according to the current "rules"
non-deterministic?
(I'll stop here but keep the remainder of Philip's analysis below.)
Steve
> Suppose that routing alice-2 causes it to be redirected to alice-1 (this
> is in effect what was happening in Harald's situation). If this is the
> alice-2 that came from alice, Exim will treat alice-1 as a "redirection
> to ancestor" because it will find alice-1 further up the ancestor chain.
> That case works: you are allowed to forward to yourself. The router that
> processed the original alice-1 will be skipped when the second alice-1
> is routed.
> However, if it is bob's alice-2 that is retained, once it is redirected
> to alice-1, Exim discards it as a duplicate (because alice-1 has been
> handled in a different ancestor chain). Thus, in this case, no delivery
> at all occurs because all the addresses have been discarded.
> This is of course a total disaster.
> It occurs to me also that the existence of the redirect_router option
> could make this worse. Each alice-2 could have a redirect_router setting
> that caused it to be handled entirely differently, for example, one
> might redirect to adam and the other to eve.
> My reaction to this discovery is .... Aarrgghh!!
> I'm afraid nothing can be done about this in the short term, but I guess
> there should be a long-term plan to redesign this part of Exim from
> scratch. In fact, the duplicate detection logic has become messy over
> time and has caused a number of problems. See, for example, the
> following ChangeLog entries:
> 4.11/17 4.11/85 4.05/31 4.02/17 3.953/55 3.21/7 ...
> So it's clearly a rather fragile area.
> Sorry, folks.
> Philip
----------------------------------------------------------------------
Steven A. Reisman <sar@???> P.O. Box 409
Press Enter LLP 421 N 2nd Street
715-426-2100 or 651-436-5254 River Falls, WI 54022
----------------------------------------------------------------------