Re: [Exim] Retry rules/failing host

Author: Tim Jackson
Date:
To: exim-users
Subject: Re: [Exim] Retry rules/failing host

On Mon, 16 Dec 2002 21:01:12 +0000 (GMT) Philip wrote:

Thanks for the reply.

> On Mon, 16 Dec 2002, Tim Jackson wrote:
> > messages seem to be bouncing after only two days. For
> > example, a message received by the server for user@??? at
> > 05:48 on Saturday morning caused a permanent error message to be
> > generated at 05:48 this morning, with "No route to host: retry timeout
> > exceeded".
> Two days since what? Message arriving?

Yes, as above.

> That means nothing. Exim's retrying is *host* based, not message based.
> Your retry rule says "bounce stuff when the host has been down for 4
> days". I suspect that's what it is doing.

OK, understood. But messages (this isn't by any means a one-off) seem to
be bouncing consistently *exactly* (to the minute) two days after they are
received, which doesn't fit with this? (I believe what you are saying is
that this should only happen if the 4 day host-based cutoff occurs two
days after a particular message is received?) Since I'm using a
close-to-default config, delay_after_cutoff is presumably true (I haven't
set it false), and therefore they ought to be bounced immediately, which
isn't happening?

For example, I've just this minute sent a message to the failing address.
Since delay_after_cutoff is true, and the host has been down for more than
my max cutoff, the message should bounce immediately? But:

2002-12-16 22:40:37 18O3uf-0003Dv-00 <= me@???
    H=myhost (myhost) [my.ip] P=esmtp S=729 id=blah
2002-12-16 22:40:37 18O3uf-0003Dv-00 == user@??? R=dnslookup
    T=remote_smtp defer (-53): retry time not reached for any host
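
To spell out the behaviour I was expecting, here is a rough sketch of how I read the delay_after_cutoff description in the spec, reduced to a single host. This is only my interpretation, not Exim's actual code: the function name, the 4 day default and the timestamps are all made up for illustration.

```python
from datetime import datetime, timedelta

def action_for_new_message(first_failed, next_retry, now,
                           cutoff=timedelta(days=4),
                           delay_after_cutoff=True):
    """Simplified single-host model of my reading of the spec (an assumption).

    Returns "try", "bounce" or "defer" for a newly arrived message.
    """
    past_cutoff = (now - first_failed) >= cutoff
    retry_due = now >= next_retry
    if retry_due:
        # The host's retry time has arrived, so a delivery attempt is made.
        return "try"
    if past_cutoff:
        # Host has been failing longer than the final cutoff and no retry is due:
        # bounce without trying (default), or try once if the option is false.
        return "bounce" if delay_after_cutoff else "try"
    # Host is still inside its retry period: leave the message on the queue.
    return "defer"

# Hypothetical timestamps, chosen only to exercise both branches.
now = datetime(2002, 12, 16, 22, 40)
next_retry = datetime(2002, 12, 17, 2, 0)
long_dead = action_for_new_message(datetime(2002, 12, 10, 5, 48), next_retry, now)
recently_failed = action_for_new_message(datetime(2002, 12, 14, 5, 48), next_retry, now)
print(long_dead, recently_failed)
```

If this model is right, the "defer" in the log above would mean Exim does not consider the host to have been failing for longer than the cutoff.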

Having said that, looking at the output of exim_dumpdb, the most recent
entry (~4 days ago) was "connection refused" rather than "no route to
host", so I wonder if (contrary to what I understand to be the case), the
remote server has actually been intermittently up and down. But even if
that's the case, I still don't understand why lots of messages sent at
various times over a period of several days have been bouncing *exactly*
two days after receipt by the server in each case. It's impossible that in
every case, the 4 day cutoff has been reached *exactly* two days after
receipt of each message.
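
The arithmetic behind that last point, as a small sketch. It assumes a single host-based cutoff instant (first failure plus 4 days); the helper name and all timestamps are invented for illustration, not taken from my logs.

```python
from datetime import datetime, timedelta

def ages_at_cutoff(first_failed, cutoff, arrivals):
    """Age of each queued message at the moment the host-based cutoff is reached."""
    cutoff_time = first_failed + cutoff
    # Only messages that arrived before the cutoff are waiting on the queue then.
    return [cutoff_time - arrived for arrived in arrivals if arrived <= cutoff_time]

# Hypothetical first failure and arrival times.
first_failed = datetime(2002, 12, 10, 5, 48)
arrivals = [
    datetime(2002, 12, 11, 9, 0),
    datetime(2002, 12, 12, 5, 48),
    datetime(2002, 12, 13, 17, 30),
]
ages = ages_at_cutoff(first_failed, timedelta(days=4), arrivals)
for arrived, age in zip(arrivals, ages):
    print(arrived, "would be", age, "old at the cutoff")
```

With one cutoff instant per host, messages that arrive at different times are different ages when it is reached, so at most one of them could be exactly two days old.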

Also, in section 31.6 you mention calculation of post-cutoff retry times,
but the algorithm isn't mentioned. Is it rather complicated?

I should probably add that the situation in this case is simple:

failing.domain IN MX 5 unreachable.server.
failing.domain IN MX 10 my.exim.server.

domainlist relay_to_domains = <text file including failing.domain>

As is probably apparent, there's a low-level but constant stream of
messages inbound for this domain.

Thanks for the help! Even if Exim's acting 100% normally, I'd still like
to get to the bottom of this for my own understanding, since I'm
struggling to tie up what's happening with what the spec says should
happen.

Tim
