Re: [exim] Failed to get write lock

Author: W B Hacker
Date:  
To: exim users
Subject: Re: [exim] Failed to get write lock
Mr David Robertson wrote:
> We log all failed addresses and after 3 fails add the address to our bad
> address list, so delivery is not attempted again. Not certain about
> success percentages but they are pretty good I think because of the
> above. We also have some basic retry rules in place to cover
> graylisting, other defers, failures and everything times out after 2 days.
>
> I would be very interested in any tips you have regarding retry rules.
>
> David


We are probably 'odd man out' on retry, but believe that:

-- while SMTP isn't supposed to be reliable or guaranteed in theory, in
practice it works so well, and so often, that end-users believe it to be
as certain as such things get.

-- accordingly, we may cease retry in as little as 13 minutes for some
servers.

That allows the sender of a failed message a fighting chance to correct
a bad address, ELSE, if important enough, pick up the phone or send a
fax before scarpering off for a 3-day weekend or such.

In any case - to be aware that the message did NOT make it while they
still remember what they sent and why.

Finding that out 3 or 4 days later is useless.

Mind - it is near-as-dammit a transparent setting, as most mail gets
thru in one go, and whatever is going to fail will almost always get a
5XX hard fail in the first minute or two.

That which hits a 'defer' will get thru by the 4th attempt if not the
second, ELSE never. Given the prevalence of bad greylisting
implementations, the first retry is probably best made no sooner than the
five-minute mark, and the final give-up no later than twenty minutes.
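
To make those timings concrete, here is a minimal sketch in Python of a
fixed-interval schedule: retry every five minutes, stop at twenty. The
function name and the strict fixed interval are assumptions for
illustration only; the actual retry rules are not quoted in this thread.
In Exim terms it would correspond roughly to a fixed-interval retry rule
of the form F,20m,5m.

```python
from datetime import timedelta

def fixed_retry_schedule(interval, cutoff):
    """Return retry offsets, measured from the first deferral, at a fixed
    interval up to and including the cutoff. Hypothetical helper, not
    part of Exim."""
    offsets = []
    t = interval
    while t <= cutoff:
        offsets.append(t)
        t += interval
    return offsets

# Assumed values taken from the text: first retry at 5 minutes,
# give up at 20 minutes.
schedule = fixed_retry_schedule(timedelta(minutes=5), timedelta(minutes=20))

for n, offset in enumerate(schedule, start=1):
    minutes = int(offset.total_seconds() // 60)
    print(f"retry {n}: {minutes} min after first deferral")
```

Counting the initial delivery as attempt one, the 4th attempt lands at the
fifteen-minute mark, with one more try at twenty before the sender is told.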

CAVEAT: As said - we're odd man out.

But even a 50-baud torn-tape operation didn't need 3 days to report a
failure...

HTH,

Bill



>
> W B Hacker wrote:
>> Mr David Robertson wrote:
>>> These machines have recently been upgraded, 3 days ago. New hardware and
>>> OS version. They replaced machines that were 3 years old that had been
>>> running the same configuration for that time without a problem.
>>>
>>> I have indeed fixed the problem by reverting to a normal -q20m but
>>> thanks for the info. I think the problem may have been caused by
>>> FreeBSD 7.1 being far better at handling multi-CPU systems and processes
>>> running into each other (big guess).
>>>
>>> These servers are dumps that handle failed mail. So the cron job was a
>>> brutal way of trying one last delivery and far quicker than -q20m when
>>> you know the queue is going to be large and very slow.
>>>
>>> David
>>>
>> I'd be interested in what percentage of previously-failed traffic they
>> manage to eventually deliver - and why that portion failed on the primary.
>>
>> Our retry rules take care of 'legitimate' greylisting w/o much load, so
>> the only thing we ordinarily see as ND's are the odd 'tmda' or sputnik -
>> left for the individual sender to deal with, or NOT.
>>
>> All else is not *ever* going to get thru - 'least not 'til a mis-spelled
>> address is manually corrected.
>>
>> Bill