On Wed, Jun 18, 2008 at 04:12:11PM +0200, Gergely Nagy wrote:
> I put up two reports (load, iostat, queue, ps ax) at
> http://195.70.33.28/~algernon/exim-stuff/
Quite interesting, because I, too, fail to see what should drive the load
up this far.
But: the disk load is not spread well. sda/sdb get quite a lot of write
operations and might become a bottleneck soon. The rest has a good
distribution of ops. It might be interesting to find out why the traffic
is so different, but it is not a problem at this time and likely won't be
for quite a while.
> There are quite a few exims stuck in D, indeed, as expected. The
> question remains, though - why? Why is it taking that long to process a
> maildir, when my quickly hacked up perl script finished even the largest
> directory within 10 seconds (and most others in one).
Are they stuck processing maildirs? Try stracing them to see what
exactly takes so long. That's the most important thing to me.
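A minimal sketch for picking the processes worth stracing, assuming a
Linux /proc and that the command name contains "exim" (both are
assumptions, adjust to your setup). The helper names are made up for
illustration. It only lists PIDs in state D and prints a strace command
line for each; it does not attach to anything itself.

```python
import os

def parse_stat(line):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so cut at the first '(' and the last ')'.
    head, _, tail = line.rpartition(")")
    pid, _, comm = head.partition("(")
    return int(pid), comm, tail.split()[0]

def blocked_pids(name="exim", proc_root="/proc"):
    """Return PIDs whose command name contains `name` and whose state is D."""
    found = []
    if not os.path.isdir(proc_root):
        return found  # no procfs here, nothing to report
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except OSError:
            continue  # process exited while we were scanning
        if name in comm and state == "D":
            found.append(pid)
    return found

if __name__ == "__main__":
    for pid in blocked_pids():
        # -tt: timestamps, -T: time spent in each syscall, -p: attach to PID
        print(f"strace -tt -T -p {pid}")
```

Processes in D state come and go, so it may take a few runs to catch
one that stays blocked long enough to attach to.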
> However, there's one idea I was thinking of: whether it is possible to
> give a transport a timeout, so if it does not finish within N seconds,
> it aborts, logs an error, and will get retried later on?
Bad idea. A transport that has already started will have caused page
faults and I/O ops for nothing if you abort it, thus increasing the
overall load.
One thing that comes to mind: Did you separate the spool from the
maildirs? That helps a lot. Try putting a number of spool directories
each on its own plain filesystem; there is no need to use RAID0/dm there,
as Exim distributes the load evenly. If you experience a bad ops
distribution on the maildirs, try creating the filesystem with a
different group size, ideally one that is coprime to the chunk size of
the underlying RAID groups, so that group beginnings are distributed
among all devices. That helps particularly with rather empty
filesystems; as they fill, the problem disappears naturally.
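To illustrate the group size point, here is a toy model, not a
description of any particular filesystem or RAID layer. It assumes plain
striping where block b lives on device (b // chunk) % n_devices. The
numbers (16-block chunks, 8 devices, 128 groups) are made up.

```python
from collections import Counter

def group_start_devices(group_blocks, chunk_blocks, n_devices, n_groups):
    """Count how many block groups start on each device of a striped array."""
    counts = Counter()
    for g in range(n_groups):
        start_block = g * group_blocks
        # Simple striping model: consecutive chunks rotate over the devices.
        device = (start_block // chunk_blocks) % n_devices
        counts[device] += 1
    return counts

if __name__ == "__main__":
    # 32768 is a multiple of chunk * devices; 32767 is coprime to the chunk size.
    for size in (32768, 32767):
        counts = group_start_devices(size, chunk_blocks=16, n_devices=8,
                                     n_groups=128)
        print(size, sorted(counts.items()))
```

In this model, a group size of 32768 puts every group start on device 0,
while 32767 spreads the 128 group starts evenly over all 8 devices. A
real filesystem may restrict which group sizes it accepts.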
And finally, not related to your system: unless you have a good reason
not to, use RAID stripes as large as you can if you make use of the
whole disk anyway. Split operations have a higher cost, and small stripes
split them more often.
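A rough way to see why small stripes split more often, as a sketch
under simplifying assumptions: requests have a fixed size and start at a
uniformly distributed sector offset within a chunk. The function name
and the sizes used below are illustrative only.

```python
def split_fraction(request_sectors, chunk_sectors):
    """Fraction of start offsets at which a request crosses a chunk boundary."""
    split = 0
    for offset in range(chunk_sectors):
        # The request covers [offset, offset + request_sectors); it is split
        # if it reaches past the end of the chunk it starts in.
        if offset + request_sectors > chunk_sectors:
            split += 1
    return split / chunk_sectors

if __name__ == "__main__":
    for chunk in (32, 128, 1024):
        print(chunk, split_fraction(8, chunk))
```

For 8-sector requests in this model, about 22% are split with 32-sector
chunks, but fewer than 1% with 1024-sector chunks.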
This is what a busy maildir store looks like here (load 2-3 at this time):
avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            9.79    0.00    3.95   27.42    0.00   58.84

Device:   rrqm/s  wrqm/s     r/s     w/s  rsec/s  wsec/s avgrq-sz avgqu-sz   await   svctm   %util
sda         0.77   46.59   15.85   28.78  175.26   72.58     5.55     0.27    6.09    3.81   17.01
sdb         0.76   46.61   16.00   28.76  176.73   72.64     5.57     0.29    6.47    4.02   18.00
sdc         0.97   45.64   16.60   24.00  202.25   50.07     6.21     0.24    5.91    3.97   16.10
sdd         0.54   45.68   16.10   24.11  176.22   68.87     6.10     0.26    6.39    4.22   16.97
sde         0.98   42.70   16.25   23.29  199.05   20.87     5.56     0.28    7.04    4.20   16.59
sdf         0.54   42.75   16.13   23.39  177.13   39.68     5.49     0.28    6.98    4.25   16.82
sdg         1.03   40.45   16.73   23.62  203.21    5.53     5.17     0.29    7.13    4.17   16.85
sdh         0.60   40.49   16.36   23.73  178.61   24.34     5.06     0.30    7.53    4.36   17.46
sdi         0.73   40.03   16.27   23.43  179.06    0.69     4.53     0.26    6.60    4.14   16.43
sdj         0.74   40.05   16.18   23.41  178.31    0.69     4.52     0.28    6.97    4.28   16.93
It took me a while to balance things like that, but it is possible.
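One way to put a number on "balanced" is the ratio between the busiest
and the least busy device for a given column. The sketch below parses
the device lines quoted above; the helper names are made up, and the
max/min ratio is just one possible measure.

```python
IOSTAT_SAMPLE = """\
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.77 46.59 15.85 28.78 175.26 72.58 5.55 0.27 6.09 3.81 17.01
sdb 0.76 46.61 16.00 28.76 176.73 72.64 5.57 0.29 6.47 4.02 18.00
sdc 0.97 45.64 16.60 24.00 202.25 50.07 6.21 0.24 5.91 3.97 16.10
sdd 0.54 45.68 16.10 24.11 176.22 68.87 6.10 0.26 6.39 4.22 16.97
sde 0.98 42.70 16.25 23.29 199.05 20.87 5.56 0.28 7.04 4.20 16.59
sdf 0.54 42.75 16.13 23.39 177.13 39.68 5.49 0.28 6.98 4.25 16.82
sdg 1.03 40.45 16.73 23.62 203.21 5.53 5.17 0.29 7.13 4.17 16.85
sdh 0.60 40.49 16.36 23.73 178.61 24.34 5.06 0.30 7.53 4.36 17.46
sdi 0.73 40.03 16.27 23.43 179.06 0.69 4.53 0.26 6.60 4.14 16.43
sdj 0.74 40.05 16.18 23.41 178.31 0.69 4.52 0.28 6.97 4.28 16.93
"""

def parse_iostat(text):
    """Map device name -> {column name: value} for an iostat -x style table."""
    rows = [line.split() for line in text.strip().splitlines()]
    header = rows[0][1:]  # drop the leading "Device:" label
    return {row[0]: dict(zip(header, map(float, row[1:]))) for row in rows[1:]}

def spread(stats, column):
    """Ratio of the largest to the smallest value of `column` across devices."""
    values = [dev[column] for dev in stats.values()]
    return max(values) / min(values)

if __name__ == "__main__":
    stats = parse_iostat(IOSTAT_SAMPLE)
    for column in ("w/s", "%util"):
        print(column, round(spread(stats, column), 2))
```

For the sample above the ratio is about 1.24 for w/s and about 1.12 for
%util.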
Michael