On Thu, 30 Dec 1999, Theo Schlossnagle wrote:
> I have a problem... this is a tough one perhaps...
Yes.
> I am running exim in a production setup and we are sending out around 6
> million emails (unique and individually addressed) per day. This has
> worked fine until now. We are slightly overloading the systems now and
> they can't keep up. If we get a spike in the flow of emails, the queue
> size jumps up to 200,000 messages on the machine that saw the spike.
Exim is not designed for handling large queues.
> Exim takes O(n) time to start, where n is the size of the queue
> (directly proportional). It appears to be reading the entire queue: not
> the message contents, but the directories and msglog information.
It is not reading msglog information. It never reads that. However, a
queue runner reads the directories in order to get a list of all the
messages. It keeps the list in main memory, and works its way through
it. The idea was that it wouldn't keep the directory open for long
periods.
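
As a rough illustration of that pattern (a Python sketch, not Exim's
actual C code), a queue runner can be pictured as below. The function
names are hypothetical, and the sketch assumes each queued message has
a header file whose name ends in "-H" in a single input directory. The
listing step touches every directory entry, which is why start-up time
grows in proportion to the queue size.

```python
import os


def list_queued_messages(input_dir):
    """Return the ids of all messages that have a header file in input_dir.

    Assumption for this sketch: one "<id>-H" file per queued message.
    """
    ids = []
    # The directory is opened, read in full, and closed again before any
    # delivery work starts, so it is not held open for long periods.
    with os.scandir(input_dir) as entries:
        for entry in entries:
            if entry.name.endswith("-H"):
                ids.append(entry.name[:-2])
    return ids


def run_queue(input_dir, deliver):
    """Build the in-memory list once, then work through it."""
    for msg_id in list_queued_messages(input_dir):
        deliver(msg_id)  # 'deliver' is a caller-supplied placeholder
```

With 200,000 messages on the queue, the loop in list_queued_messages
runs at least 200,000 times before the first delivery is attempted.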
> The way I would fix it is to map a decent-sized shared memory (sysV)
> segment and keep the info found there, so as long as there is at least
> one exim process running, the spool directory doesn't need to be
> scanned. I would wager this would be a lot of hacking.
Complete re-design. Exim does not have a central control process. It
does not have a list of messages - the list is the contents of the spool
directories. The lack of a central process is actually one of the
reasons Exim scales fairly well - other MTAs that have a single "queue
manager" process have, I am told, found this to be a bottleneck.
However, it does mean that a queue-runner process has to scan the
file system to get a list of messages.
> Please tell me I missed something in the docs and I can say something
> like
>
> full_spool_scan = false ;)
I wish I could!
--
Philip Hazel University of Cambridge Computing Service,
ph10@??? Cambridge, England. Phone: +44 1223 334714.