Re: [EXIM] Smart Host Timeout Problems

Author: Philip Hazel
Date:  
To: Lee McLoughlin
CC: exim-users
Subject: Re: [EXIM] Smart Host Timeout Problems
On Thu, 5 Mar 1998, Lee McLoughlin wrote:

> The smart mail machine here is jealousy.icparc.ic.ac.uk which doesn't appear
> in this output.


But your configuration lists it as mail.icparc.ic.ac.uk, which is a
CNAME. This explains the phenomenon. It is a bug which is fixed in the
next release of Exim. This is the ChangeLog entry:

4. If an existing SMTP connection was passed to a new Exim to deliver a
waiting message, and the host list was supplied in the transport, but for some
reason no host matched the connected host, Exim behaved as if all retry times
for all the hosts had expired, and incorrectly bounced the address.

The existing connection will have been handed to the new Exim under the name
jealousy.icparc.ic.ac.uk, but the host that the address thinks it needs is
mail.icparc.ic.ac.uk, so the two names never match.
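
As a rough illustration of why the mismatch ends in a bounce, here is a much simplified model of the host loop. It is only a sketch and not Exim's real code: the host_item struct, its "canonical" field (standing in for the name after the lookup has expanded the CNAME), and both function names are hypothetical. Only the strcmp() test against continue_hostname and the "expired" flag are taken from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified model of the host loop. Not Exim's real code. */
typedef struct {
  const char *name;       /* name as written in the transport's host list */
  const char *canonical;  /* assumed name after the lookup (CNAME target) */
} host_item;

/* Old order: the name is compared before any lookup, and a mismatch
   skips the host without touching "expired". Returns the final flag. */
static int old_loop_expired(const host_item *hosts, int n,
                            const char *continue_hostname)
{
  int expired = 1;
  assert(hosts != NULL);
  for (int i = 0; i < n; i++) {
    if (continue_hostname != NULL &&
        strcmp(continue_hostname, hosts[i].name) != 0)
      continue;                 /* skipped; "expired" is still set */
    expired = 0;                /* host considered usable in this model */
  }
  return expired;
}

/* Patched order: the comparison uses the name after the lookup, and a
   mismatch clears "expired" so the address is deferred, not bounced. */
static int patched_loop_expired(const host_item *hosts, int n,
                                const char *continue_hostname)
{
  int expired = 1;
  assert(hosts != NULL);
  for (int i = 0; i < n; i++) {
    if (continue_hostname != NULL &&
        strcmp(continue_hostname, hosts[i].canonical) != 0) {
      expired = 0;
      continue;
    }
    expired = 0;                /* host considered usable in this model */
  }
  return expired;
}
```

With a single host whose configured name is mail.icparc.ic.ac.uk and whose looked-up name is jealousy.icparc.ic.ac.uk, the old loop finishes with "expired" still set when the connection is to jealousy.icparc.ic.ac.uk, which is the state that led to the bounce. The patched loop finishes with it cleared.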

Below is a patch I produced for 1.82.

Philip

-- 
Philip Hazel                   University Computing Service,
ph10@???             New Museums Site, Cambridge CB2 3QG,
P.Hazel@???          England.  Phone: +44 1223 334714




*** exim-1.82/src/transports/smtp.c    Fri Dec 19 10:37:18 1997
--- src/transports/smtp.c    Tue Feb 10 12:37:44 1998
***************
*** 1419,1450 ****
    fallback_anchor = host;
    }


- /* If the queue_smtp flag is set, we don't actually want to do any
- deliveries. Instead, all addresses are to be deferred, and the hints
- as to which hosts they are waiting for must be set. Can't use the
- set_errno function here, as it only sets PENDING return values. */
- 
- if (queue_smtp)
-   {
-   address_item *addr;
-   transport_update_waiting(hostlist, tblock->name);
-   for (addr = addrlist; addr != NULL; addr = addr->next)
-     {
-     addr->transport_return = DEFER;
-     addr->basic_errno = 0;
-     addr->message = "queue_smtp option set";
-     }
-   DEBUG(2) debug_printf("Leaving %s transport: queue_smtp option set\n",
-     tblock->name);
-   if (expanded_hosts != NULL)
-     {
-     if (fallback_anchor != NULL) fallback_anchor->next = NULL;
-     free_hosts(hostlist, expanded_hosts);
-     }
-   return;
-   }
- 
- 
  /* Sort out the service, i.e. the port number. We want the port number in
  network byte order, and that's what getservbyname() produces, so we have
  to use htons() if the configuration specified a port by number instead of
--- 1425,1430 ----
***************
*** 1516,1522 ****
  time. After that, set the status and error data for any addresses that haven't
  had it set already. */


- 
  for (cutoff_retry = 0; expired &&
       cutoff_retry < ((ob->delay_after_cutoff)? 1 : 2);
       cutoff_retry++)
--- 1496,1501 ----
***************
*** 1529,1545 ****
      address_item *first_addr = NULL;
      char *retry_key = NULL;


-     /* If this is a continued delivery, we are interested only in the host
-     which matches the name of the existing open channel. */
- 
-     if (continue_hostname != NULL && strcmp(continue_hostname, host->name) != 0)
-       continue;
- 
-     /* Count hosts being considered - purely for an intelligent comment
-     if none are usable. */
- 
-     hosts_total++;
- 
      /* If the address hasn't yet been obtained from the host name, look it up
      now, unless the host is already marked as unusable at this time. If the
      "name" is in fact an IP address, just copy it over and check for being a
--- 1508,1513 ----
***************
*** 1616,1621 ****
--- 1584,1609 ----
          }
        }


+     /* If the queue_smtp option is set, we don't actually want to attempt
+     any deliveries. If this is a continued delivery, we are interested only
+     in the host which matches the name of the existing open channel. The check
+     is put here after the local host lookup, in case the name gets expanded
+     as a result of the lookup. Set expired FALSE in both cases, to save the
+     outer loop executing twice. */
+ 
+     if (queue_smtp ||
+         (continue_hostname != NULL &&
+          strcmp(continue_hostname, host->name) != 0))
+       {
+       expired = FALSE;
+       continue;
+       }
+ 
+     /* Count hosts being considered - purely for an intelligent comment
+     if none are usable. */
+ 
+     hosts_total++;
+ 
      /* The first time round the outer loop, check the status of the host by
      inspecting the retry data. The second time round, we are interested only
      in expired hosts that haven't been tried since this message arrived. */
***************
*** 1846,1865 ****
  /* Get here if all IP addresses are skipped or defer at least one address. Add
  a standard message to each deferred address if there hasn't been an error, that
  is, if it hasn't actually been tried this time. The variable "expired" will be
! TRUE unless at least one address was not expired. However, if
! ob->delay_after_cutoff is FALSE, some of these expired hosts might have been
! tried. If so, an error code will be set, and the failing of the message is
! handled by the retry code later. */


  for (addr = addrlist; addr != NULL; addr = addr->next)
    {
!   if (addr->transport_return == DEFER &&
         (addr->basic_errno == ERRNO_UNKNOWNERROR || addr->basic_errno == 0) &&
         addr->message == NULL)
      {
      addr->basic_errno = ERRNO_HRETRY;
!     if (expired)
        {
        addr->message = (ob->delay_after_cutoff)?
          "retry time not reached for any host after a long failure period" :
          "all hosts have been failing for a long time and were last tried "
--- 1834,1868 ----
  /* Get here if all IP addresses are skipped or defer at least one address. Add
  a standard message to each deferred address if there hasn't been an error, that
  is, if it hasn't actually been tried this time. The variable "expired" will be
! TRUE unless at least one address was not expired, except in one special case
! (see below). However, if ob->delay_after_cutoff is FALSE, some of these expired
! hosts might have been tried. If so, an error code will be set, and the failing
! of the message is handled by the retry code later.


+ If queue_smtp is set, or this transport was called to send a subsequent message
+ down an existing TCP/IP connection, and something caused the host not to be
+ found, we end up here, but can detect these cases and handle them specially. */
+ 
  for (addr = addrlist; addr != NULL; addr = addr->next)
    {
!   if (queue_smtp)    /* no deliveries attempted */
!     {
!     addr->transport_return = DEFER;
!     addr->basic_errno = 0;
!     addr->message = "queue_smtp option set";
!     }
! 
!   else if (addr->transport_return == DEFER &&
         (addr->basic_errno == ERRNO_UNKNOWNERROR || addr->basic_errno == 0) &&
         addr->message == NULL)
      {
      addr->basic_errno = ERRNO_HRETRY;
!     if (continue_hostname != NULL)
        {
+       addr->message = "no host found for existing SMTP connection";
+       }
+     else if (expired)
+       {
        addr->message = (ob->delay_after_cutoff)?
          "retry time not reached for any host after a long failure period" :
          "all hosts have been failing for a long time and were last tried "



--
*** Exim information can be found at http://www.exim.org/ ***