These machines were upgraded recently, 3 days ago: new hardware and a new
OS version. They replaced machines that were 3 years old and had been
running the same configuration for that whole time without a problem.
I have indeed fixed the problem by reverting to a normal -q20m, but
thanks for the info. I think the problem may have been caused by
FreeBSD 7.1 being far better at handling multi-CPU systems, so that
processes now run into each other (big guess).
These servers are dumps that handle failed mail. So the cron job was a
brutal way of trying one last delivery and far quicker than -q20m when
you know the queue is going to be large and very slow.
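
For reference, a minimal sketch of the two approaches discussed in this
thread. It only prints the command lines and does not run Exim. The binary
path is an assumption (the usual FreeBSD ports location), and the crontab
entry is a hypothetical illustration, not the exact job that was in use.

```shell
# Sketch only: prints the invocations instead of executing them.
EXIM_BIN=/usr/local/sbin/exim   # assumed path; adjust for your install

daemon_cmd() {
    # One master daemon that listens for SMTP (-bd) and also starts
    # a queue runner at the given interval (-q<time>, e.g. 20m).
    printf '%s -bd -q%s\n' "$EXIM_BIN" "$1"
}

cron_line() {
    # Hypothetical crontab entry: a separate one-off queue run (-q)
    # every $1 minutes, started independently of any daemon.
    printf '*/%s * * * * %s -q\n' "$1" "$EXIM_BIN"
}

daemon_cmd 20m
cron_line 2
```

With the daemon form there is a single parent process scheduling the queue
runs, whereas the cron form starts each run without reference to the others.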
David
W B Hacker wrote:
> Mr David Robertson wrote:
>> Good suggestion. Did a diff. Unfortunately there are no major
>> differences.
>> Just the things you would expect: hostname etc.
>>
>> David
>
> Graeme's suggestion w/r clearing out the hints has helped me in similar
> odd circumstances. Average need has been less than once per three years,
> though.
>
> I'd also be tempted to browse the queue, for which lynx or links are
> handy over ssh, as they 'navigate' a durn sight more comfortably than
> 'ls -l' and 'less'.
>
> And - not to put too fine a point on it, but why use a cron'd
> queue-runner if the interval is only 2 minutes anyway?
>
> I have run;
>
> exim -bd -q55s
>
> ..for ages - the short interval only because we are 10% Maildir and
> IMAP-only, so intra-inter-office traffic is as fast as IRC.
>
> -q2m should obviate the need for any cron job, and a single 'master'
> Exim is less likely to collide with itself than an external invocation
> marching to the beat of a different drummer.
>
> HTH,
>
> Bill
>
>> W B Hacker wrote:
>>> Mr David Robertson wrote:
>>>> Thanks for the suggestion. But I've already tried that.
>>>> Stops errors for a minute or two but they do return.
>>>> Both servers are using the same versions of exim and bdb.
>>>> db41-4.1.25_4
>>>> Yet one errors and the other does not.
>>>>
>>>> David
>>>>
>>> Have you tried diff'ing their ~/configure files?
>>>
>>> Bill
>>>
>>>
>>>> Graeme Fowler wrote:
>>>>> On Tue, 2009-02-17 at 11:46 +0000, Mr David Robertson wrote:
>>>>>> Failed to get write lock for /var/spool/exim/db/retry.lockfile
>>>>> If you Google for that you'll find several suggestions, most of which
>>>>> indicate that something has either corrupted the DB file or the
>>>>> system's
>>>>> DB library has been changed, or has a bug.
>>>>>
>>>>> If you stop exim and remove all the files (yes, all of them)
>>>>> from /var/spool/exim/db, restart exim again - what happens?
>>>>>
>>>>> Graeme
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>>
>>>>>
>>>>>
>>>>> No virus found in this incoming message.
>>>>> Checked by AVG - www.avg.com Version: 8.0.237 / Virus Database:
>>>>> 270.10.25/1956 - Release Date: 02/16/09 18:31:00
>>>>>