Re: [Exim] Reliability of spool/delivery handling (Linux)?

Author: Dr Andrew C Aitchison
Date:  
To: Philip Hazel
CC: Lutz Pressler, exim-users
Subject: Re: [Exim] Reliability of spool/delivery handling (Linux)?
> On Tue, 21 Aug 2001, Lutz Pressler wrote:
>
> > Is the way Exim handles spool files and local delivery safe, especially
> > on Linux ext2?
> > James Antill seems to think that it's not.
>
> I am not an expert on this stuff. I just have to go by what I read in
> the manuals and what people tell me.
>
> > Just in case anyone cares here's what exim does (AFAICS)...
> >
> > int fd1 = open(f1);
> > write(fd1);
> > fsync(fd1);
> >
> > int fd2 = open(tmp);
> > write(fd2);
> > fsync(fd2);
> > rename(tmp, f2); // Good at this point.
> >
> > So that seems to rely on all dir operations being sync.


On Tue, 21 Aug 2001, Philip Hazel wrote:
> I'm afraid I don't have enough knowledge/understanding to follow how or
> why that is relying on whatever it is that is worrying people. My
> understanding of rename() is that it is atomic, that is, from the point
> of view of other processes, either it has happened or it has not. There
> is never a halfway state when neither the old file nor the new exists.
>
> If there really is a problem, please can somebody explain it to me in
> more detail?


I followed the sample fix http://www.and.org/exim-3.31-dirfsync.patch
well enough to make some guesses.

While fsync flushes the file to disk and rename is atomic from the
point of view of other processes, I surmise that rename is *not* atomic
below the filesystem level. There is a window during which all
processes will be told that the rename has happened, but the
disk thinks that it has not.

Should the machine crash during this window, the file will
have no name, and thus appear in lost+found. Or something.
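
To make the guess concrete, here is a minimal sketch of how I would expect
that window to be closed: after the rename, open the containing directory
and fsync it, so that the new directory entry is itself forced to disk.
The function name durable_rename and its arguments are my own invention
for illustration. This is not code from Exim, nor copied from the patch
above, and it assumes a system (such as Linux) where fsync on a directory
file descriptor is permitted.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from Exim or the patch): rename tmp to dst,
   then fsync the directory that holds dst so the new name reaches disk.
   Returns 0 on success, -1 on any failure (errno is left set). */
int durable_rename(const char *tmp, const char *dst, const char *dir)
{
    int dfd;

    /* Atomic as seen by other processes, but possibly still only in memory. */
    if (rename(tmp, dst) != 0)
        return -1;

    /* A read-only descriptor on the directory is enough for fsync. */
    dfd = open(dir, O_RDONLY);
    if (dfd < 0)
        return -1;

    /* Flush the directory itself, i.e. the entry created by rename. */
    if (fsync(dfd) != 0) {
        close(dfd);
        return -1;
    }

    return close(dfd);
}
```

In terms of the sequence quoted earlier, this would stand in for the bare
rename(tmp, f2) call, with dir being the directory that contains f2. The
fsync(fd2) on the file before the rename is still needed; the directory
fsync only covers the name, not the contents.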

-- 
Dr. Andrew C. Aitchison        Computer Officer, DPMMS, Cambridge
A.C.Aitchison@???    http://www.dpmms.cam.ac.uk/~werdna