Author: Theo E. Schlossnagle
Date:
To: exim-users, linux-xfs
Subject: [Exim] Exim and XFS filesystem
Hello all,
We have been running exim for a while and we run it on over 75 machines
(Linux, BSDs, Solaris). We recently started using SGI's xfs filesystem for
most of our operations because of its speed and stability -- we are _very_
happy with it. I have never had any problems with it ... until now.
Setup: Exim v3.14, v3.22, v3.33 and Linux 2.4.2-xfs. The xfs partition in
question is running atop a RAID-1 md device on two 9GB SCSI drives.
After running Exim with its spool directory on an xfs partition under low
load (100 messages/minute), I would soon get an Exim process spinning, CPU
bound, that I could not kill [kill -9 did nothing]. The system was stuck on
disk writes (so any process that called fsync or friends would get stuck in
the run queue, never to come out again). No modified files were written to
disk (by any process) after this point. A reboot was required to restore
"normal" operation.
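For anyone who wants to check whether fsync calls stall on a given mount,
here is a rough probe, offered only as a sketch. It is not what we ran; the
function name fsync_probe and its parameters are made up for illustration. It
writes small files into a directory, calls fsync on each, and reports the
slowest fsync. If the filesystem wedges the way described above, the probe
itself would presumably block in fsync and never return, which is itself the
signal.

```python
import os
import time


def fsync_probe(directory, count=100, size=4096):
    """Write `count` small files in `directory`, fsync each one, and
    return the slowest fsync duration in seconds.

    Hypothetical helper for illustration; not part of Exim or xfs tooling.
    """
    worst = 0.0
    payload = b"x" * size
    for i in range(count):
        path = os.path.join(directory, "probe-%d.tmp" % i)
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            os.write(fd, payload)
            start = time.monotonic()
            os.fsync(fd)  # the call that reportedly never came back
            worst = max(worst, time.monotonic() - start)
        finally:
            os.close(fd)
            os.unlink(path)  # leave the directory as we found it
    return worst
```

Point it at a scratch directory on the partition under test, not at a live
spool, and run it from a shell you can afford to lose.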
We tried many things to fix this with no success, but as soon as we configured
Exim to use a non-xfs (ext2 in this case) mounted spool directory, the problem
instantly disappeared.
It looked as if the kernel had a thread stuck writing to or reading from the
filesystem journal. If anyone knows a solution to this problem, I am all
ears. Otherwise, steer clear of running your Exim spools on xfs.
--
Theo Schlossnagle
1024D/82844984/95FD 30F1 489E 4613 F22E 491A 7E88 364C 8284 4984
2047R/33131B65/71 F7 95 64 49 76 5D BA 3D 90 B9 9F BE 27 24 E7