RE: [Exim] /var/spool/exim/db/retry.lockfile problems

Author: Smith, A.D.
Date:
To: exim-users
Subject: RE: [Exim] /var/spool/exim/db/retry.lockfile problems
# exim -bV
Exim version 4.10 #6 built 07-Mar-2003 12:52:11
Copyright (c) University of Cambridge 2002

This system is run as a mail gateway, so everything is local: deliveries go by SMTP to (primarily) local mail hosts, whose load has not changed since the existing configuration was placed in service.


-----Original Message-----
From: Philip Hazel [mailto:ph10@cus.cam.ac.uk]
Sent: Tuesday, March 11, 2003 2:04 PM
To: Smith, A.D.
Cc: exim-users@???
Subject: Re: [Exim] /var/spool/exim/db/retry.lockfile problems


On Tue, 11 Mar 2003, Smith, A.D. wrote:

> I keep seeing messages in my mainlog:
> Failed to get write lock for /var/spool/exim/db/retry.lockfile: timed out
>
> if I do ls -ld /var/spool/exim/db:
> drwxrwx--T   2 mm       exim         512 Mar 11 10:34 /var/spool/exim/db
> and ls -l /var/spool/exim/db/
> -rw-r-----   1 mm       exim           0 Mar 11 10:34 retry.dir
> -rw-r-----   1 mm       exim           0 Mar 11 10:34 retry.lockfile
> -rw-r-----   1 mm       exim        1024 Mar 11 10:37 retry.pag
> -rw-r-----   1 mm       exim           0 Mar 11 10:34 wait-smtp.dir
> -rw-r-----   1 mm       exim           0 Mar 11 10:34 wait-smtp.lockfile
> -rw-r-----   1 mm       exim        1024 Mar 11 10:37 wait-smtp.pag
>
> I've looked through the exim docs and the archives, but I can't seem to see any problems getting a write file lock for retry.lockfile.
> This problem does not occur when the system is not under heavy load.
> The load quickly goes from 0.19 to in excess of 4.0 when this happens


Exim does try to be as quick as it can when updating the retry database,
which is the only time it needs an exclusive (write) lock. If mail is
all moving without any problems, it shouldn't need to update it at all.
Therefore, if there is a lot of updating going on, so much so that there
is lock contention, it must mean that there are a lot of temporary
delivery failures happening.
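
One rough way to check whether temporary failures really are piling up is to count deferral lines in the main log. The sketch below assumes the default main log layout (date, time, message ID, then a flag field, where `==` marks a deferred delivery). `count_deferrals` is a hypothetical helper, not part of Exim, and the path in the usage comment is only a guess at where the log lives on this system.

```python
from typing import Iterable


def count_deferrals(lines: Iterable[str]) -> int:
    """Count log lines whose flag field is '==' (delivery deferred)."""
    count = 0
    for line in lines:
        fields = line.split()
        # Assumed layout: date, time, message ID, flag, ...
        if len(fields) >= 4 and fields[3] == "==":
            count += 1
    return count


# Usage sketch (the path is an assumption, adjust for your installation):
#   with open("/var/spool/exim/log/mainlog", errors="replace") as log:
#       print(count_deferrals(log))
```

A count that climbs sharply during the high-load periods would be consistent with heavy retry database updating.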

Hmm. I suppose if there were masses of processes getting read locks,
that might delay the one process that was trying to get a write lock.
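
The effect described here can be shown in miniature. This is only an illustration using `flock()` on a temporary file, which stands in for a lockfile such as retry.lockfile; it is not Exim's code, and it assumes a platform (such as Linux) where `flock()` locks belong to each separately opened file handle.

```python
import fcntl
import os
import tempfile

# Hypothetical stand-in for a lockfile such as retry.lockfile.
fd, path = tempfile.mkstemp(suffix=".lockfile")
os.close(fd)

reader_a = open(path, "rb")
reader_b = open(path, "rb")
writer = open(path, "r+b")

# Shared (read) locks coexist: both are granted immediately.
fcntl.flock(reader_a, fcntl.LOCK_SH | fcntl.LOCK_NB)
fcntl.flock(reader_b, fcntl.LOCK_SH | fcntl.LOCK_NB)


def try_exclusive(f) -> bool:
    """Return True if an exclusive lock was obtained without waiting."""
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


blocked_while_readers_hold = not try_exclusive(writer)

# The writer gets in only after every reader has let go.
fcntl.flock(reader_a, fcntl.LOCK_UN)
still_blocked_with_one_reader = not try_exclusive(writer)
fcntl.flock(reader_b, fcntl.LOCK_UN)
granted_after_release = try_exclusive(writer)

for f in (reader_a, reader_b, writer):
    f.close()
os.remove(path)
```

With a steady stream of overlapping readers, there may never be a moment when no shared lock is held, so the writer keeps waiting.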

I presume that you have /var/spool/exim on a local disc?

The timeout is one minute, which I thought should be ample. What DB
library are you using? ("exim -bV" will tell you.)
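
The one-minute limit can be pictured as a deadline on the lock request. A minimal sketch, assuming a polling loop around a non-blocking `flock()`; Exim's own implementation may well differ. The 60-second default mirrors the timeout mentioned above, and `acquire_write_lock` and `poll` are hypothetical names.

```python
import fcntl
import time


def acquire_write_lock(f, timeout: float = 60.0, poll: float = 0.05) -> bool:
    """Try for an exclusive lock on open file f until timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except BlockingIOError:
            if time.monotonic() >= deadline:
                # A caller would report something like "timed out" here.
                return False
            time.sleep(poll)
```

A return value of False corresponds to the situation reported in the log: the lock was still held by others when the deadline passed.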

--
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.