[exim] Bug in duplicate detection?

Author: Harald Schueler
Date:  
To: exim-users
New-Topics: [exim-dev] Re: [exim] Bug in duplicate detection?
Subject: [exim] Bug in duplicate detection?
Or maybe I'm stretching Exim's configuration beyond what can be expected
to work? Or have I made a stupid mistake?

One of our users (let's call her Alice) reported that she missed _some_
mails, while others from the same person arrived. She gets copies of all
mails addressed to "Bob <bob@newdomain>" and also, of course, mail
addressed to her own address "Alice <alice@newdomain>". But when mail
was addressed to "bob@newdomain, alice@newdomain" (by a sender not knowing
she would already get a copy of the mail to Bob), it was delivered only to
Bob, while, interestingly, mail addressed to "alice@newdomain, bob@newdomain"
was delivered as expected.

The system in question is a mail hub for several dozen domains. No mail
is delivered directly on the server, but instead routed to one of
several backend mailbox servers of different flavours (for historical,
political and technical reasons) using SMTP or LMTP. Each user has a
unique userid, and each userid has a unique mailbox on exactly one of
the mailbox servers, although any number of mail addresses from any of
the domains can be associated with the mailbox.

This is implemented on the hub by first looking up the userid and
rewriting the envelope recipient to "<userid>@local", then performing
all user specific actions (filtering, autoreplies, forwarding,...), and
at last setting up a transport for the destination mailbox server.

Most mailbox servers quite happily receive mail addressed to
"<userid>@local", but we recently "inherited" a few servers not under
our control, which require the envelope recipient to be in the form
"<local_part>@olddomain", where <local_part> may be the userid or, in a
special case, even an alias as in "Bob@olddomain". There is an
additional complication in that this address must still be recognized as
a valid recipient on the mail hub, so there exists the possibility of
creating a loop (userb@olddomain -> userb@local -> userb@olddomain),
which must be handled by the configuration. I think it is this fact that
fools Exim's duplicate detection.
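
The loop can be made concrete with a small Python sketch. It only
illustrates the two address rewrites described above and says nothing about
Exim's routing code; the names ALIAS, to_userid, to_backend and follow are
made up for this example, and the addresses are the example addresses used
in this post.

```python
# Toy model of the two rewrites described above (not Exim code).
# ALIAS maps every accepted address to its userid, as the hub's lookup does.
ALIAS = {
    "bob@newdomain": "userb",
    "bob@olddomain": "userb",
    "userb@olddomain": "userb",
}


def to_userid(address):
    """Hub lookup: any known address becomes <userid>@local."""
    return ALIAS[address] + "@local"


def to_backend(address):
    """Rewrite for the inherited servers: <userid>@local -> <userid>@olddomain."""
    local_part, _domain = address.split("@")
    return local_part + "@olddomain"


def follow(address, max_steps=10):
    """Apply the rewrites alternately and stop as soon as an address repeats."""
    path = [address]
    for _ in range(max_steps):
        current = path[-1]
        if current.endswith("@local"):
            nxt = to_backend(current)
        else:
            nxt = to_userid(current)
        path.append(nxt)
        if nxt in path[:-1]:
            break
    return path


print(" -> ".join(follow("userb@olddomain")))
# prints: userb@olddomain -> userb@local -> userb@olddomain
```

Without something that breaks the cycle (in the configuration below this is
the job of redirect_router), the address would bounce between the two forms.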

I have distilled the important parts of the configuration into the
following self-contained exim configuration, which reproduces the problem:

#==================================================
#/etc/exim4/exim4.conf

qualify_domain = local

begin routers

# Find owner (userid) of email address.
# In reality this is a table lookup.
alias:
   driver = redirect
   data = ${extract{$local_part@$domain}{ \
     alice@newdomain = usera \
     alice@olddomain = usera \
     usera@olddomain = usera \
     \
     bob@newdomain   = userb \
     bob@olddomain   = userb \
     userb@olddomain = userb \
   }}


# Now do all sorts of fancy user specific things (filtering,
# forwarding, anti-spam-settings,...)
# Forwarding is where the problem shows up.
forward:
   driver = redirect
   domains = local
# userb is forwarding a copy of all of his mail to Alice,
# using her old uid-style address (which is the same address that
# has to be set for deliveries to oldserver). This is necessary for
# the problem to turn up.
   data = ${extract{$local_part}{ \
     userb = userb,usera@olddomain \
   }}


# This is a bit ugly, I would love to see a better solution:
# We need to set up a transport to oldserver, but
# unfortunately oldserver is not happy with the destination uid@local;
# rather, it needs a "matching" envelope recipient (think of an
# MS Exchange server not under our control). So we rewrite the
# recipient into the necessary form, which is also a valid recipient
# on incoming mail. Then we have to set up the transport to oldserver,
# but Exim does not allow setting a transport in the redirect router,
# so we have to simulate it with a second router.
oldserver:
   driver = redirect
   domains = local
   local_parts = usera : userb
# use address_data to select transport in the next router
   address_data = oldserver
# rewrite envelope recipient
   data = $local_part@olddomain
# This prevents the loop, but... confuses Exim?
   redirect_router = localmux

localmux:
   driver = accept
   condition = $address_data
   transport = $address_data

begin transports

oldserver:
driver = smtp
hosts = 10.10.10.10
#==================================================

So now all single-recipient mails work as expected, even

# exim -N alice@newdomain bob@newdomain < /dev/null
LOG: MAIN
<= root@local U=root P=local S=241
LOG: MAIN
*> userb@olddomain <bob@newdomain> R=localmux T=oldserver
H=10.10.10.10 [10.10.10.10]
LOG: MAIN
*> usera@olddomain <alice@newdomain> R=localmux T=oldserver
H=10.10.10.10 [10.10.10.10]
LOG: MAIN
Completed

works, but swapping the addresses produces the problem:

# exim -N bob@newdomain alice@newdomain < /dev/null
LOG: MAIN
<= root@local U=root P=local S=241
LOG: MAIN
*> userb@olddomain <bob@newdomain> R=localmux T=oldserver
H=10.10.10.10 [10.10.10.10]
LOG: MAIN
Completed

Alice gets neither two copies nor one, but none at all.

I have looked into the debug log, but I am not sure I can interpret it
correctly. My impression is: Exim goes through the addresses one at a
time and only routes them one step (until the next redirect router),
then eliminates dupes in the resulting list and goes on. This leads to
the following addresses and router passes:

0:  bob@newdomain                           alice@newdomain
      (alias)                                 (alias)
1: userb@local                             usera@local
      (forward)                               (oldserver)
2: userb@olddomain usera@olddomain <-dup-> usera@olddomain
      (alias)          (alias)                 discarded
3: userb@local     usera@local
      (oldserver)       (*)
4: userb@olddomain
      (localmux)


(*) At this point "usera@local" is discarded a second time. Is this
because the following router "oldserver" has already seen the address
(before it was discarded the first time in step 2)? The debug output
does not indicate this explicitly:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Considering: userb@olddomain
unique = userb@olddomain
dbfn_read: key=R:olddomain
dbfn_read: key=R:userb@olddomain
no domain retry record
no address retry record
userb@olddomain: queued for routing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Considering: usera@local
unique = usera@local
usera@local is a duplicate address: discarded
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>


I have tried to come up with a simpler example which shows this
behaviour, but have failed so far. On the other hand, while I can easily
avoid the situation leading to the error (e.g. by normalizing the
forwarding data outside the mail system) I am not sure that there are no
other cases where this can happen, so I am a bit uncomfortable with the
situation, even if Exim should perform as specified.
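
For reference, normalizing the forwarding data could look roughly like the
following Python sketch. It assumes the forwarding targets are stored as a
comma-separated string, as in the "forward" router above, and that the
address-to-userid table is available outside Exim; ALIAS, USERIDS and
normalize_forward are hypothetical names for this example. Every target
that belongs to a local user is replaced by the bare userid, so an address
like usera@olddomain no longer appears in the forwarding data.

```python
# Address -> userid table, copied from the "alias" router in the example config.
ALIAS = {
    "alice@newdomain": "usera",
    "alice@olddomain": "usera",
    "usera@olddomain": "usera",
    "bob@newdomain": "userb",
    "bob@olddomain": "userb",
    "userb@olddomain": "userb",
}
USERIDS = set(ALIAS.values())


def normalize_forward(data):
    """Rewrite a comma-separated forwarding list so that local users
    appear only as bare userids; unknown addresses are left untouched."""
    result = []
    for target in data.split(","):
        target = target.strip()
        if not target:
            continue
        if target in USERIDS:
            canonical = target
        else:
            canonical = ALIAS.get(target.lower(), target)
        if canonical not in result:  # drop repeats, keep the original order
            result.append(canonical)
    return ",".join(result)


print(normalize_forward("userb,usera@olddomain"))
# prints: userb,usera
```

This only cleans up the data that is fed to the "forward" router; it does
not answer whether other configurations can trigger the same behaviour.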

The problem could be avoided by changing Exim to postpone duplicate
elimination until after all routing is done, but I don't know what other
consequences this would have.
--
Harald