Re: Database corruption

Author: Philip Hazel
Date:  
To: Pete Ashdown
CC: Exim Mailing List
Subject: Re: Database corruption
On Wed, 28 May 1997, Pete Ashdown wrote:

> It seems like almost once a week I need to remove the old databases to
> start fresh again. The program "exim_tidydb" generates a segfault when
> trying to access the database, and all queued mail is forever stuck.


You are, I presume, using Berkeley DB version 1.x? That seems to be in
use in all these corruption cases. Which operating system? Which
compiler? We are using Berkeley DB on Solaris 2.5 on some of our busier
mail systems and haven't had any problems at all, over months and
months. Not sure which compiler was used to compile it. David?

> Having a database that corrupts once a week sort of defeats the purpose
> of having a database. Is there any preventive maintenance I can do to stop
> this from happening? Am I completely screwed once it does happen? Can
> anything be done to the exim code to make sure that the database is good
> before proceeding?


The problem is that the only thing that can detect the corruption is the
DB library. It is a pity that it segfaults rather than giving some kind
of error return.

> It is rather unsettling that exim continues running as normal, happily
> chucking things into never-never-land (the corrupted queue). I would
> rather it shut down completely when the queue is not valid.


It is not the queue that is in the database. It is just delivery hints.
Unfortunately, Exim can't tell its hints are invalid - every time it tries
to open the DB file, the DB functions crash it. And there isn't a concept
of "shutting down" Exim, since it is not a single process.

I am going to do some work on the database code fairly soon. I am
contemplating using an entirely distinct file for locking, rather than
trying to lock the file(s) used by the DB library. I suspect that some
of the trouble may be caused by updates happening to those files on
open(), before Exim has a chance to lock anything. If Exim locks a
separate file before going anywhere near the real files, this shouldn't
happen. The problems that some people have seen when a DB library tries
to do its own locking, and this doesn't interwork with Exim's own
locking, should also go away. It should then be easier to use Exim with
gdbm. I'm also going to take a look at Berkeley DB 2.0, which has been
recently released.
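
To make the separate-lock-file idea concrete, here is a minimal sketch.
It assumes POSIX fcntl() record locking. The names hints_lock() and
hints_unlock() and the lock file path are hypothetical illustrations,
not Exim's actual code. The point is only the ordering: the lock is
taken on a separate file before the DB library opens its own files.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not an Exim function): take an exclusive lock on
   a separate lock file. Call this BEFORE the DB library opens any of
   its own files. Returns the lock file descriptor, or -1 on failure. */
int hints_lock(const char *lock_path)
{
    int fd = open(lock_path, O_RDWR | O_CREAT, 0640);
    if (fd < 0)
        return -1;

    struct flock fl = {0};
    fl.l_type = F_WRLCK;     /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* length 0 means "to end of file" */

    /* F_SETLKW blocks until the lock is granted; retry if a signal
       interrupts the wait. */
    while (fcntl(fd, F_SETLKW, &fl) < 0) {
        if (errno != EINTR) {
            close(fd);
            return -1;
        }
    }
    return fd;
}

/* Hypothetical helper: release the lock. Closing the descriptor drops
   the fcntl() lock held by this process on that file. Call this only
   AFTER the DB library has closed its files. */
void hints_unlock(int fd)
{
    if (fd >= 0)
        close(fd);
}
```

A caller would do hints_lock(), then open, use and close the database,
then hints_unlock(). Because the DB files are never touched while the
lock is not held, anything the library writes on open() is covered too.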

Philip

-- 
Philip Hazel                   University Computing Service,
ph10@???             New Museums Site, Cambridge CB2 3QG,
P.Hazel@???          England.  Phone: +44 1223 334714