Re: [exim] Valid working host "failing for a long time"

Author: Colin
Date:  
To: exim-users
Subject: Re: [exim] Valid working host "failing for a long time"
On 21/03/2012 11:39, Phil Pennock wrote:
> On 2011-08-25 at 10:30 +0100, Colin wrote:
>> On 25/08/2011 10:01, Graeme Fowler wrote:
>>> Please try two things: Firstly, run "exim_tidydb /path/to/spool retry"
>>> and see what happens, if the problem continues then simply stop Exim
>>> and *remove* the retry DB file altogether (it's only a hints file so
>>> is safe to delete). It strikes me that there may be some stale data
>>> lying around. Alternatively it could be related to a very old bug in
>>> Exim's retry handling but that was wrinkled out a long time ago...
>>>
>>> Graeme
>> Thanks for the reply,
>>
>> Shortly after my last email I actually used exim_dumpdb to check if
>> there were any entries in any of the databases that related to the
>> domains in question. The only entries were in the callout db and none
>> seemed relevant to the error.
>>
>> The only oddity I came across was that there was no misc.lockfile
>> present to dump that db though I could create one (removed after) and
>> the dump returned nothing.
>>
>> I am running Exim version 4.69, so I wouldn't think there would be any
>> old bugs knocking around. The OS is CentOS 5, so that is the latest
>> version available from the packages.
> Resurrecting this thread, because I think that I know part of what's
> happening.
>
> Exim honours the retry database for any delivery happening as part of a
> queue-run. It's only "immediate" delivery which bypasses the retry
> rules and goes straight through, clearing DB problems.
>
> So if you have any mails matching a queue_only directive, then the mail
> will not be immediately delivered. When I saw this problem as a
> postmaster, I was queuing all mail and kicking off deliveries once per
> minute, which let connection re-use work really well.
>
> In your case, if you cross over something like queue_only_load, then
> mails received at that time will go into the queue.
>
> Which separately raises the issue of why the successful deliveries
> aren't clearing the DB state, but does explain why this would only be
> happening to some mails.
>
> It's so obvious in retrospect, with those specific delivery details
> called out. *sigh*
>
> It's _possible_ that Exim needs to change from "retry rules honoured for
> all messages found via queue-runs" to "retry rules honoured for all
> messages found via queue-runs, unless it would be the first delivery
> attempt for this mail". There are profound load implications to
> changing this, so I'm not going to rush into it. Perhaps a new config
> option ...
>
> -Phil


Scrap my previous message. I re-read the very first email of the thread
and remembered a bit more. The solution was to simplify the transport,
and I think it came from a post on the list or an email from a list
member.

Most of the info in the transport was redundant so cutting it down to
this cleared out the problem:

remote_smtp_smart:
  driver = smtp
  delay_after_cutoff = false
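
For anyone finding this thread later, here is the same transport with a comment on what the one non-default option does, going by the Exim specification's description of the smtp transport. Treat it as an annotated sketch rather than a complete configuration: the router that hands messages to remote_smtp_smart is not shown anywhere in this thread, so it is left out here too.

```
# transports section of the Exim configuration
remote_smtp_smart:
  driver = smtp
  # The default is true: once every host for a domain has passed its retry
  # cutoff, an address can be bounced without any delivery attempt until
  # the next retry time is reached. With false, Exim instead tries hosts
  # that have not been attempted since the message arrived.
  delay_after_cutoff = false
```

That would fit the symptom in the subject line, where a host that is actually reachable keeps being treated as failing because of old retry data.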