Re: [exim] Bug in duplicate detection?

Top Page
Delete this message
Reply to this message
Author: Philip Hazel
Date:  
To: exim-users
CC: Harald Schueler
Old-Topics: Re: [exim] Bug in duplicate detection?
Subject: Re: [exim] Bug in duplicate detection?
On Tue, 18 Jan 2005, I wrote:

> The whole area of handling duplicates is quite tricky, I seem to recall
> from previous work on the routing code. The original premise was "if
> this address has previously been routed, then routing it again will give
> the same result". I suspect that the facilities that have been
> subsquently added to Exim, such as header addition and the preservation
> of $address_data and the like, may have broken the underlying
> assumption. However, making any changes would amount to a complete
> routing re-design, I suspect.


Good news! I hope.

The paragraph above is correct, but the final sentence was overly
pessimistic. Having caught up on other stuff, I finally took a more
detailed look at this this afternoon, and the solution turned out to be
remarkably simple. At least I hope so. I'm a bit worried that I have
broken something, so I would be grateful if people could test what I
have done. All the existing tests run clean, and I've added some new
ones for the bug that provoked this thread.

The particular patch for this bug is below. I have also made a new
snapshot in:

ftp://ftp.csx.cam.ac.uk/pub/software/email/exim/Testing/exim-snapshot.tar.gz
ftp://ftp.csx.cam.ac.uk/pub/software/email/exim/Testing/exim-snapshot.tar.gz.sig

with all the latest changes (as always, look at doc/ChangeLog). I would
be grateful if people could test this version as much as possible. In
particular, tests that it handles duplicate addresses correctly will
give me more confidence. When I get back from Africa I want to make it
the basis for a 4.51 release. Of course, there may be other issues that
crop up in the meantime, so there may still be some changes.

Philip

-- 
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.



*** exim-4.50/src/deliver.c Thu Feb 17 14:49:11 2005
--- deliver.c    Thu Apr  7 16:40:50 2005
***************
*** 4251,4256 ****
--- 4274,4331 ----




+ /*************************************************
+ *     Check list of addresses for duplication    *
+ *************************************************/
+ 
+ /* This function was introduced when the test for duplicate addresses that are
+ not pipes, files, or autoreplies was moved from the middle of routing to when
+ routing was complete. That was to fix obscure cases when the routing history
+ affects the subsequent routing of identical addresses. If that change has to be
+ reversed, this function is no longer needed. For a while, the old code that was
+ affected by this change is commented with !!!OLD-DE-DUP!!! so it can be found
+ easily.
+ 
+ This function is called after routing, to check that the final routed addresses
+ are not duplicates. If we detect a duplicate, we remember what it is a
+ duplicate of. Note that pipe, file, and autoreply de-duplication is handled
+ during routing, so we must leave such "addresses" alone here, as otherwise they
+ will incorrectly be discarded.
+ 
+ Argument:     address of list anchor
+ Returns:      nothing
+ */
+ 
+ static void
+ do_duplicate_check(address_item **anchor)
+ {
+ address_item *addr;
+ while ((addr = *anchor) != NULL)
+   {
+   tree_node *tnode;
+   if (testflag(addr, af_pfr))
+     {
+     anchor = &(addr->next);
+     }
+   else if ((tnode = tree_search(tree_duplicates, addr->unique)) != NULL)
+     {
+     DEBUG(D_deliver|D_route)
+       debug_printf("%s is a duplicate address: discarded\n", addr->unique);
+     *anchor = addr->next;
+     addr->dupof = tnode->data.ptr;
+     addr->next = addr_duplicate;
+     addr_duplicate = addr;
+     }
+   else
+     {
+     tree_add_duplicate(addr->unique, addr);
+     anchor = &(addr->next);
+     }
+   }
+ }
+ 
+ 
+ 


  /*************************************************
  *              Deliver one message               *
***************
*** 5265,5270 ****
--- 5342,5359 ----
        continue;
        }


+ 
+     /* !!!OLD-DE-DUP!!!  We used to test for duplicates at this point, in order
+     to save effort on routing duplicate addresses. However, facilities have
+     been added to Exim so that now two identical addresses that are children of
+     other addresses may be routed differently as a result of their previous
+     routing history. For example, different redirect routers may have given
+     them different redirect_router values, but there are other cases too.
+     Therefore, tests for duplicates now take place when routing is complete.
+     This is the old code, kept for a while for the record, and in case this
+     radical change has to be backed out for some reason. */
+ 
+     #ifdef NEVER
      /* If it's a duplicate, remember what it's a duplicate of */


      if ((tnode = tree_search(tree_duplicates, addr->unique)) != NULL)
***************
*** 5280,5285 ****
--- 5369,5377 ----
      /* Record this address, so subsequent duplicates get picked up. */


      tree_add_duplicate(addr->unique, addr);
+     #endif
+ 
+ 


      /* Get the routing retry status, saving the two retry keys (with and
      without the local part) for subsequent use. Ignore retry records that
***************
*** 5592,5597 ****
--- 5684,5704 ----
  local_user_gid = (gid_t)(-1);
  local_user_uid = (uid_t)(-1);


+
+ /* !!!OLD-DE-DUP!!! The next two statement were introduced when checking for
+ duplicates was moved from within routing to afterwards. If that change has to
+ be backed out, they should be removed. */
+
+ /* Check for any duplicate addresses. This check is delayed until after
+ routing, because the flexibility of the routing configuration means that
+ identical addresses with different parentage may end up being redirected to
+ different addresses. Checking for duplicates too early (as we previously used
+ to) makes this kind of thing not work. */
+
+ do_duplicate_check(&addr_local);
+ do_duplicate_check(&addr_remote);
+
+
/* When acting as an MUA wrapper, we proceed only if all addresses route to a
remote transport. The check that they all end up in one transaction happens in
the do_remote_deliveries() function. */