Re: [Exim] Performance bottleneck scanning large spools.

Author: Gary Palmer
Date:  
To: Philip Hazel
CC: Kai Henningsen, exim-users
Subject: Re: [Exim] Performance bottleneck scanning large spools.
Philip Hazel wrote in message ID
<Pine.SOL.3.96.1000112093538.11421B-100000@???>:
> That's an interesting idea. Clearly it would be easy to set up a queue-
> runner that stopped scanning the directory when it has 1000 messages.
> However, the next one, started a minute later, would probably get more
> or less the same 1000 messages - but would that really matter?


That's mostly a matter of tuning. You can tune the interval between
rescans for queue work, and you can tune how much work is pulled from
the queue. It very much depends on where you are delivering to. Most
of my mail is inbound (>1.5 million messages/day), and we use exim to
route the inbound mail to the correct backend mailstore. In that
case, if a queue builds up, you can probably flush it pretty quickly
once the fault is corrected. Since both the front-end boxes and the
backends are multi-processor, running just one queue runner wastes
time, and you get faster delivery by running deliveries in parallel.
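
To make that concrete: with exim, the two knobs are the daemon's
queue-run interval and how many extra one-shot runners you kick off.
Something like this (the flags are standard exim, but the interval
and the number of runners are purely illustrative):

  # daemon accepting mail, with a queue run every 2 minutes
  exim -bd -q2m

  # extra one-shot queue runners in parallel, e.g. after a dead
  # backend comes back; each -q is an independent queue runner
  exim -q & exim -q & exim -q &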

That is obviously a radically different situation from the default,
where exim is contacting remote servers (i.e. outbound SMTP) which
are down a lot of the time, so a fair percentage of the queued
messages don't go away. Sendmail's answer to that (and exim already
does it in its own way) was the MinQueueAge parameter, which stopped
sendmail from repeatedly trying to deliver mail for which a recent
delivery attempt had failed. Exim does this in a much more elegant
fashion with its retry rules.
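
For comparison: sendmail's knob is a single global holdoff in
sendmail.cf (8.7-ish long-option syntax, from memory):

  O MinQueueAge=30m

whereas exim's retry section shapes the backoff per domain/host; the
stock example rule is along these lines:

  # retry section of the exim configure file
  *    *    F,2h,15m; G,16h,1h,1.5; F,4d,6h

i.e. retry every 15 minutes for the first 2 hours, then back off
geometrically (starting at 1-hour intervals, factor 1.5) up to 16
hours, then every 6 hours until the message is 4 days old.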

The net effect was that although you might pull repeat work off the
queue every time you ran it, you'd quickly skip over that work and
get on with real deliveries. This proved to be an effective method of
queue management and also provided a means of quick recovery from a
failure. As long as you had queue runners that walked the full queue
every so often, to deliver mail that sat at the tail and would
otherwise be ignored, the MaxQueueRunSize parameter proved extremely
useful.
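
(MaxQueueRunSize itself is just a one-liner, either in sendmail.cf or
via -O on the command line of a particular queue run; from memory:

  O MaxQueueRunSize=500

with the value obviously depending on spool size and run frequency.)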

> They would both be doing deliveries. What *would* be a problem would
> be the case where the first 1000 messages obtained from the
> directory scan all failed to deliver. No queue-runner would then
> ever see any other messages.


Yes, that is definitely a problem, and there is no good solution to
it as far as I can see, unless you only put work onto your list of
1000 messages to be tried when the retry database says that the
remote domain/host is ready for another shot according to the retry
rules. Sendmail didn't do that, as far as I recall, and frequently
built up a list of 1000 messages only to skip over a number of them
as recently tried and undeliverable (as decided by the MinQueueAge
parameter).

I think exim has an elegant way out of the situation by using the
retry database, assuming the DB lookup isn't too expensive to do
during the queue scan itself rather than once the scan list has been
built. The problem I see there is that you have to do mail routing
for the message before you can look in the retry database (i.e. DNS
lookups, rewrites, etc.). All of those operations can (and often do)
block, and would just slow down the queue scanner.
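
(At least the retry database is cheap to inspect from the outside;
exim ships an exim_dumpdb utility, invoked roughly as

  exim_dumpdb /var/spool/exim retry

with your own spool path substituted, which prints the retry records
along with their last-try and next-try times.)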

The way we got around this problem in sendmail was creative use of
cron jobs. Rather than use a -q<time> parameter, we ignored it
entirely and scheduled all queue runners from cron. This allowed us
to run a limited queue every couple of minutes and (say) every half
hour fork off a queue runner with no limits at all. Gross, yes, but
it worked. To be honest, I think that having every message in your
<work size> chunk be undeliverable is an absolute worst-case
scenario, and in all likelihood, if that happens, there isn't a whole
lot of sense in running the full queue either, as I'd bet most of it
would be undeliverable too.
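
For the curious, the crontab looked roughly like this (paths and
numbers are illustrative; the -O long-option form needs sendmail 8.7
or later, and the */2 step syntax needs Vixie cron):

  # limited queue run every two minutes
  */2 * * * *   /usr/sbin/sendmail -O MaxQueueRunSize=500 -q
  # full, unlimited queue run every half hour
  0,30 * * * *  /usr/sbin/sendmail -q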

(can you tell that I got fed up getting creative with sendmail and
much prefer exim? :) )

> > Philip, perhaps it's possible to have an ``oh fsck'' mode for exim
> > where, rather than scan the entire queue, you could treat the split
> > spool as 62 individual queues and run one directory, then go on to
> > the next?


> That is similar to another suggestion of having a queue-runner just
> look at one sub-directory. The problem I see there is how you set up
> these queue-runners. Unless you are happy to fork 62 processes in
> parallel every time, that is.


Forking 62 processes is nothing. Having 62 processes try to beat the
kernel and the disk into `submission' by scanning 62 directories at
the same time would probably not be the best idea in the world. I can
imagine a number of ways that even well-tuned kernels would just give
up and thrash to death in that scenario.
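
(For reference, the 62 comes from the spool-splitting option in the
exim configure file:

  # splits the input directory into 0-9, A-Z, a-z: 62 subdirectories
  split_spool_directory = true

so each message lives in a subdirectory named by one character of
its message id.)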

> Doing it in one queue-runner, a directory at a time, as you suggest,
> is conceptually easier. Thanks for that idea. It could even be made
> the standard way that queue-runners operate in a split spool
> environment. I have added it to the Wish List.


I haven't looked at the code, but it's probably relatively easy. If
no-one else gets to it (and if I remember), I'll see if I can do it
along with a bunch of other mods I'd like to make locally.

> > Alternatively you could have an option to split the queue up in the
> > queue runner itself. It does the normal directory scan, but after <n>
> > messages have been read, it forks off a sub-process to handle those
> > <n> messages, and carries on reading.


> ... but then you end up with <m> queue-runners instead of one, where <m>
> depends on the length of your queue and isn't controllable. It breaks
> the current "one queue-runner delivers one message at a time" semantics.


I'm reading an implied ``to a given remote destination'' here? If
so, then the semantics are sort of bent anyway, as there is no
interlock between queue runners, and various race conditions can
cause two runners to deliver to the same destination at once anyway,
unless I'm grossly mistaken. That is not the `normal' way things
should happen, granted.

Anyone ever toyed with threading exim? I know half of you just
spilled coffee on your keyboards, but it would solve a lot of these
interlock and race issues (on platforms that support pthreads). I
don't think threading the entire program would be necessary, just
enough of the queue runner to make it a single-process,
multi-threaded application. It would solve the coherency issues that
you currently avoid by only using the .db files as hints.

(ok, well, maybe one good idea a month is my quota. I'll be quiet now
:) )