Re: [Exim] Exim 4.30 release

Author: Phil Pennock
Date:  
To: Exim Users
Subject: Re: [Exim] Exim 4.30 release
On 2003-12-02 at 09:37 +0000, Philip Hazel wrote:
> I have just put Exim release 4.30 on the primary ftp site:


Thank you for changelog items 56 (header fixups) and 62
($mailstore_basename), much appreciated. :^)

For reference:
-----------------------------< cut here >-------------------------------
26. Change 3.952/11 added an explicit directory sync on top of a file sync for
    Linux. It turns out that not all file systems support this. Apparently some
    versions of NFS do not. (It's rare to put Exim's spool on NFS, but people
    do it.) To cope with this, the error EINVAL, which means that sync-ing is
    not supported on the file descriptor, is now ignored when Exim is trying to
    sync a directory. This applies only to Linux.
-----------------------------< cut here >-------------------------------
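
For anyone who hasn't looked at the change itself, the behaviour described in
item 26 amounts to roughly the following. This is only a Python sketch of the
idea, not Exim's actual C code; the function name sync_directory and its
True/False return convention are made up for illustration.

```python
import errno
import os


def sync_directory(path):
    """fsync a directory, treating EINVAL as "sync not supported here".

    Hypothetical helper: returns True if the sync succeeded and False if
    the file system reported EINVAL. Any other error is re-raised.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
        return True
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            # The file system cannot sync a directory descriptor; ignore it.
            return False
        raise
    finally:
        os.close(fd)
```

Only EINVAL is swallowed here; a genuine I/O error still propagates to the
caller, which is the point of ignoring just that one errno value.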


We're actually one of these rare cases, although we use FreeBSD. In
case it's useful in assessing how people abuse your innocent code, I'll
explain why.

If a primary MX for which we're backup is not available, then the mail
will go into such a spool. This follows from a philosophy of local disk
being fine for transient messages -- critical disk error can only have
minimal impact -- but for any message which will linger for more than a
couple of minutes, we want it on resilient storage. For manageability
(& scalability thereof) reasons, this means NetApp Filer NFS, not local
RAID.

We do use localhost_number, but we don't have multiple hosts using that
spool, since FreeBSD 4 doesn't support NLM locking for NFS -- all locks
are local to the current machine. The most another machine in the
cluster will do is an "exim -bpc" over that spool. If there's a
hardware failure, we can repoint the front-end incoming.mail boxes to a
different backend for long-lived in-spool messages. We're awaiting a
more reliable FreeBSD 5 to change that (and not 5.2, judging by the
Release Engineering "Open Issues").

To be honest though, this is one of the places where in the medium-term
future we'll be deploying Sendmail instead of Exim -- for its queue
management features for larger queues. Think of leased-line customers who
turn their mail machines off at weekends ... So in the longer term we'll
stop using NFS for any internal spools on Exim.


One _really_ strange issue we've seen is when the whole of the
spool_directory area is in NFS, instead of just the input/ & msglog/
areas; I really didn't want to stick symlinks in there, but we had to
get db/ off NFS because of massively-expanding NFS read traffic.
I suspect that the DB files are sparse, but even that doesn't explain
some of the changes. It looks like (NetApp) NFS breaks Berkeley DB on
FreeBSD badly -- the sparseness gets lost (unsurprisingly) but the
retransmits go up and the nominal file size ("ls -l") balloons to sizes
of hundreds of gigabytes. The traffic graphs of network I/O show some
interesting anomalies here; moving spool_directory itself to local disk
and pointing only input/ and msglog/ to NFS is working just fine.
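
If anyone wants to check their own hints databases for the same symptom, a
quick way is to compare the nominal size with the space actually allocated.
The sketch below is in Python and is only an illustration: sparse_report is
a made-up name, and it assumes st_blocks is counted in 512-byte units, as it
is on Linux and FreeBSD. Over NFS the block count is whatever the server
reports, so treat the numbers as a hint rather than proof.

```python
import os


def sparse_report(path):
    """Return (apparent_bytes, allocated_bytes, looks_sparse) for a file.

    apparent_bytes is the size "ls -l" shows; allocated_bytes is derived
    from st_blocks, assumed here to be in 512-byte units.
    """
    st = os.stat(path)
    apparent = st.st_size
    allocated = st.st_blocks * 512
    return apparent, allocated, allocated < apparent
```

A file whose apparent size is far larger than its allocated size is sparse;
one where both are huge has lost its sparseness somewhere along the way.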

-Phil "Fate-Tempter" Pennock,
--
2001: Blogging invented. Promises to change the way people bore strangers with
anal anecdotes about their pets. <http://www.thelemon.net/issues/timeline.php>