[exim-dev] [Bug 1874] New: Exim 4.87 silently losing email t…

Author: admin
Date:  
To: exim-dev
New-Topics: [exim-dev] [Bug 1874] Exim 4.87 failing to retry lmtp deliveries
Subject: [exim-dev] [Bug 1874] New: Exim 4.87 silently losing email to our local mail stores (failing to retry lmtp deliveries)
https://bugs.exim.org/show_bug.cgi?id=1874

            Bug ID: 1874
           Summary: Exim 4.87 silently losing email to our local mail
                    stores (failing to retry lmtp deliveries)
           Product: Exim
           Version: 4.87
          Hardware: x86
                OS: Linux
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Delivery in general
          Assignee: nigel@???
          Reporter: dpc22@???
                CC: exim-dev@???


Created attachment 917
--> https://bugs.exim.org/attachment.cgi?id=917&action=edit
Log entries demonstrating lost email.

On 03/08/2016 I upgraded Exim on our central mail hubs to the latest stable
version, Exim 4.87. The previous build was a snapshot of
https://github.com/Exim/exim.git which my colleague Tony Finch deployed on
24/08/2015 and tagged "4.86_36-e07b163". The configuration was not changed.

Nine days later we received a report that email was no longer being delivered
consistently to all members of a given mailing list. Investigation revealed
that a small fraction of our local email was losing recipients: 911 messages
were affected over that 9 day period.

On Saturday I reverted to "4.86_36-e07b163", again with no change in
configuration. The problem vanished.

The root cause appears to be a serialize_hosts limit that we place on LMTP
delivery to our Cyrus mailstores in order to spread out load:

hermes_lmtp:
  driver                = smtp
  protocol              = lmtp
  gethostbyname         = true
  rcpt_include_affixes  = true
  serialize_hosts       = *
  address_retry_include_sender = false


As a specific example, the following delivery attempt was not retried:

  2016-08-12 03:25:09 +0100 1bY29l-0007xg-QF
     == fanf2@??? <fanf2@???> R=hermes_lmtp T=hermes_lmtp
     defer (-53): connection limit reached for all hosts


I attach the complete Exim logs for that message. Observe that the message is
only delivered to one of four recipients and then completes immediately without
any retry attempt.
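
The pattern described above (a "connection limit reached" defer followed by
"Completed" with no later delivery to that recipient) can be searched for in a
main log. What follows is only a minimal sketch, not the method used to arrive
at the figure of 911 messages. It assumes each log entry is on a single line
(the example above is wrapped for email), that message IDs use the 6-6-2
format seen in 1bY29l-0007xg-QF, and that the recipient address is the first
field after the "==", "=>", "->" or "**" flag. The function name
find_lost_recipients is hypothetical.

```python
import re

# Assumed line shape: "<date> <time> [<zone>] <message-id> <rest of entry>"
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?: [+-]\d{4})? "
    r"(?P<msgid>[0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{2}) "
    r"(?P<rest>.*)$"
)


def find_lost_recipients(lines):
    """Return {message_id: [recipients]} for recipients that were deferred
    with "connection limit reached" and never delivered or bounced before
    the message was logged as Completed."""
    deferred = {}  # message id -> set of recipients still outstanding
    lost = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        msgid = match.group("msgid")
        rest = match.group("rest")
        fields = rest.split()
        if not fields:
            continue
        flag = fields[0]
        if flag == "==" and len(fields) > 1 and "connection limit reached" in rest:
            # Deferred recipient: remember it until we see it handled.
            deferred.setdefault(msgid, set()).add(fields[1])
        elif flag in ("=>", "->", "**") and len(fields) > 1:
            # Delivered or bounced: the recipient is accounted for.
            deferred.get(msgid, set()).discard(fields[1])
        elif flag == "Completed":
            # Anything still outstanding at completion was never retried.
            pending = deferred.pop(msgid, set())
            if pending:
                lost[msgid] = sorted(pending)
    return lost
```

Passing it an open log file, for example find_lost_recipients(open(path)) with
path pointing at a main log, yields the message IDs worth inspecting by hand.
Matching on the address string is crude: if the address is logged in a
different form on the defer and delivery lines, this will over-report.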

I am not sure if this is related to bug 1788:

Some deliveries not re-attempted after 'T=remote_smtp defer (-53):
retry time not reached for any host' on first delivery attempt

"connection limit reached" feels different from "retry time not reached". I
will let you decide if the two bugs should be merged.
