Re: [exim] Retry time not reached for any host after a long failure period

Author: Jens Hoffrichter
Date:
To: exim-users
Subject: Re: [exim] Retry time not reached for any host after a long failure period

Hello,

Sorry that it took me so long to reply, but I was out of the office
for a couple of days and had no time to look at that particular issue.

2008/8/28 Tony Finch <dot@???>:

>> The real strange thing is, that delivery to both of the mentioned
>> hosts in there worked fine before and after, just that message (and a
>> couple more) got bounced. One thing I noticed, though, is the fact
>> that it only happened to mails which had more than one RCPT TO:
>> (newsletters which got delivered to the mailsystem, mainly)
>
> What does exinext say about your Cyrus servers?
> Are you setting retry_use_local_part where appropriate?
> Can you reproduce the problem with Exim in debugging mode?

I was actually able to reproduce the problem in exim debugging mode,
and have fixed it by now.

The problem was in the remote_lmtp_delivery transport:

remote_lmtp_delivery:
  driver = smtp
  protocol = lmtp
  port = 2003
  hosts = $domain
  gethostbyname = true

Here the host to which the mail should be delivered was expanded from
the domain of the recipient address, which the loadbalancer router
rewrites to the form uid@backend.

This generally works fine if there is only one recipient, or if all
the recipients reside on the same backend.

But if a mail should be delivered for recipients on different
backends, the line:

hosts = $domain

expanded to

hosts =

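leaving the recipient host empty. As far as I understand it, $domain
is only set in a transport when all addresses handled in that run
share the same domain; otherwise it is empty. So the transport failed,
and the message was bounced.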

Now I set the backend host in the router already, so a separate
delivery is run for each backend, and the problem no longer occurs.
The modified router and transport look like this:

loadbalancer_final:
  driver = manualroute
  condition = ${if def:address_data{yes}{no} }
  transport = remote_lmtp_delivery
  route_data = $domain
  self = send

begin transports

remote_lmtp_delivery:
  driver = smtp
  protocol = lmtp
  port = 2003

The loadbalancer router itself is unchanged.
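
An untested alternative, given only as a sketch: assuming the smtp
transport's multi_domain option behaves as documented, setting it to
false restricts each run of the transport to addresses in a single
domain, so $domain should not expand to empty and the original hosts
line could stay in the transport:

remote_lmtp_delivery:
  driver = smtp
  protocol = lmtp
  port = 2003
  # assumption: limits each transport run to one domain (one backend)
  multi_domain = false
  hosts = $domain
  gethostbyname = true

Setting the host in the router, as above, still seems the cleaner
approach, because the host is then fixed before the transport runs.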

What made this problem so hard to debug was the error message about
the retry time - in my opinion it is totally unrelated to the config
error which occurred there. I was only able to pinpoint the behaviour
in debug mode, and after looking at mail traffic for an hour or so.

Maybe a different error message could be generated if the hosts
option of an smtp transport is set empty, or expands to empty?
That would certainly help, and would make debugging this error a
lot easier.

Unfortunately, I'm not knowledgeable enough about the exim source code
to write a patch myself, but I hope you might consider doing that.

Thanks for the help, the hint about running exim in debug mode really helped.

Regards,
Jens
