Re: [Exim] Retry rules/failing host

Author: Philip Hazel
Date:  
To: Tim Jackson
CC: exim-users
Subject: Re: [Exim] Retry rules/failing host
On Mon, 16 Dec 2002, Tim Jackson wrote:

> OK, understood. But messages (this isn't by any means a one-off) seem to
> be bouncing consistently *exactly* (to the minute) two days after they are
> received, which doesn't fit with this?


Definitely not, especially the bit about "exactly to the minute". Exim
just doesn't work like that. The only way it could be doing that is if
you have some external process that is forcing deliveries exactly at
fixed times. It seems to me to be too much of a coincidence for a normal
queue runner to happen to hit this so closely so consistently.

> (I believe what you are saying is that this should only happen if the
> 4 day host-based cutoff occurs two days after a particular message is
> received?)


Yes.

> Since I'm using a close-to-default config, delay_after_cutoff is
> presumably true (I haven't set it false), and therefore they ought to
> be bounced immediately, which isn't happening?


If the host has been dead for 4 days, then yes, they ought to be bounced
immediately.

> For example, I've just this minute sent a message to the failing address.
> Since delay_after_cutoff is true, and the host has been down for more than
> my max cutoff, the message should bounce immediately? But:
>
> 2002-12-16 22:40:37 18O3uf-0003Dv-00 <= me@???
> H=myhost (myhost) [my.ip] P=esmtp S=729 id=blah
> 2002-12-16 22:40:37 18O3uf-0003Dv-00 == user@??? R=dnslookup
> T=remote_smtp defer (-53): retry time not reached for any host


Try a delivery with debugging, to see what's going on:

exim -d -M 18O3uf-0003Dv-00

> Having said that, looking at the output of exim_dumpdb, the most recent
> entry (~4 days ago) was "connection refused" rather than "no route to
> host", so I wonder if (contrary to what I understand to be the case), the
> remote server has actually been intermittently up and down.


In that case, though, the exim_dumpdb output should show a "first failed"
time less than 4 days ago.
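
To make the reasoning above concrete, here is a rough Python sketch of the decision for a single host. It is a simplified model, not Exim's actual code: the function name decide, its arguments, and the three result strings are all invented for illustration, and real Exim considers every host for the domain and records cutoff expiry in the retry database when a delivery fails rather than recomputing it on the fly.

```python
from datetime import datetime, timedelta

def decide(now, first_failed, next_try, cutoff, delay_after_cutoff=True):
    """Simplified, hypothetical model of one host's retry decision."""
    if now >= next_try:
        # The retry time has arrived, so a delivery is attempted.
        return "attempt"
    past_cutoff = (now - first_failed) > cutoff
    if not past_cutoff:
        # Logged as "retry time not reached for any host".
        return "defer"
    # Past the final cutoff with the retry time not yet reached:
    # bounce without trying if delay_after_cutoff is true.
    return "bounce" if delay_after_cutoff else "attempt"

now = datetime(2002, 12, 16, 22, 40, 37)
cutoff = timedelta(days=4)
soon = now + timedelta(hours=1)
print(decide(now, now - timedelta(days=5), soon, cutoff))  # bounce
print(decide(now, now - timedelta(days=1), soon, cutoff))  # defer
```

Under this model, a plain defer for a brand new message is what you would expect if the recorded "first failed" time is less than 4 days old.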

> But even if
> that's the case, I still don't understand why lots of messages sent at
> various times over a period of several days have been bouncing *exactly*
> two days after receipt by the server in each case. It's impossible that in
> every case, the 4 day cutoff has been reached *exactly* two days after
> receipt of each message.


Quite. I am totally mystified.

> Also, in section 31.6 you mention calculation of post-cutoff retry times,
> but the algorithm isn't mentioned. Is it rather complicated?


No - it just continues to use the last retry algorithm.
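
As an illustration only, the idea can be sketched in Python like this. The rule list assumes the retry rule shipped in the default configuration (F,2h,15m; G,16h,1h,1.5; F,4d,6h); the names RULES, pick_rule and next_interval are made up for the sketch, and the boundary handling and the geometric step are simplifications rather than a copy of what Exim does internally.

```python
from datetime import timedelta

# Assumed default retry rule: F,2h,15m; G,16h,1h,1.5; F,4d,6h
# Each entry is (algorithm letter, cutoff, interval, factor).
RULES = [
    ("F", timedelta(hours=2), timedelta(minutes=15), None),
    ("G", timedelta(hours=16), timedelta(hours=1), 1.5),
    ("F", timedelta(days=4), timedelta(hours=6), None),
]

def pick_rule(elapsed, rules=RULES):
    """Return the rule covering the time elapsed since the first failure."""
    for rule in rules:
        if elapsed <= rule[1]:
            return rule
    # Past the final cutoff: keep using the last rule.
    return rules[-1]

def next_interval(elapsed, previous_interval, rules=RULES):
    """Interval until the next retry, under this simplified model."""
    kind, _cutoff, base, factor = pick_rule(elapsed, rules)
    if kind == "F":
        return base  # fixed interval
    # Geometric: grow the previous interval, never below the starting value.
    return max(base, previous_interval * factor)

# Five days after the first failure the 4d cutoff has passed,
# but retries still come every 6 hours.
print(next_interval(timedelta(days=5), timedelta(hours=6)))  # 6:00:00
```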

> Thanks for the help! Even if Exim's acting 100% normally, I'd still like
> to get to the bottom of this for my own understanding, since I'm
> struggling to tie up what's happening with what the spec says should
> happen.


Me too. The whole area of retrying has been modified and worked over a
lot since the early days, and obscure cases do crop up from time to
time, so it's always worth investigating.

Philip

--
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.