Re: [Exim] Persistent DB connections to MySQL database

Author: Philip Hazel
Date: Wed, 15 May 2002
To: Mark Moseley
CC: exim-users
Subject: Re: [Exim] Persistent DB connections to MySQL database
On Wed, 15 May 2002, Mark Moseley wrote:

> First let me say that Exim is the greatest MTA ever created...


Thank you. When I read a paragraph like that, I know that the next one
is going to be something like

> That said, I'd like to make a feature request,


Yup... :-)

> Persistent DB connections, just like Apache::DBI in mod_perl


Very difficult because of the way Exim works. It doesn't have any
persistent processes (not counting the daemon). This is a deliberate
design, to avoid bottlenecks in mail delivery. Independent processes do
not interfere with each other, and can run well on multi-processor
hosts. [Exim was originally designed without any thought of using
databases. That all came along later.]

> We make very heavy use of MySQL in our mail setup (and I don't see how
> it could be feasible to do large-scale mail operations without
> connecting to a DB or LDAP backend).


For the very fastest performance, I would think you would do better to
abstract the relevant data from your DB or LDAP, and build local cdb
files for Exim to work off. Of course, this gives a delay between making
a database change and its taking effect, but you can arrange for that to
be relatively short. It also means that mail continues to flow when the
database is down.
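
To make that concrete, here is a rough Python sketch of the export
step. The aliases table, column names, and file paths are invented for
illustration, and it assumes the pymysql driver and djb's cdbmake tool;
Exim would then read the result with an ordinary cdb lookup such as
${lookup{$local_part}cdb{/var/exim/aliases.cdb}}.

    # Dump a (hypothetical) aliases table into a cdb file for Exim.
    # Assumes: pip install pymysql, and cdbmake(1) on the PATH.
    import subprocess
    import pymysql

    conn = pymysql.connect(host="localhost", user="exim",
                           password="secret", database="mail")
    with conn.cursor() as cur:
        cur.execute("SELECT local_part, forward_to FROM aliases")
        rows = cur.fetchall()
    conn.close()

    # cdbmake input format: "+klen,dlen:key->data\n" per record,
    # terminated by one extra blank line.
    records = b"".join(
        b"+%d,%d:%s->%s\n" % (len(k.encode()), len(v.encode()),
                              k.encode(), v.encode())
        for k, v in rows)

    # cdbmake builds the temporary file and then renames it into
    # place, so a concurrently running Exim never sees a half-built
    # database.
    subprocess.run(
        ["cdbmake", "/var/exim/aliases.cdb", "/var/exim/aliases.tmp"],
        input=records + b"\n", check=True)

The atomic rename is what makes the rebuild delay tolerable: Exim
always sees either the old complete file or the new one, never a
partial write.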

> The result is that we have upwards of 20-30 MySQL connections open at
> any time. Our DB can handle the load pretty well, but if the DB
> connections were persistent, I imagine that we could squeeze out some
> pretty nice extra performance. Just seems like a lot of overhead to
> create and tear down so many TCP sessions. Having a pool of DB
> connections cached to be reused would be much more efficient.


It would be very interesting to know if those assumptions are actually
true in practice. To do this, you would have to create a permanently
running process (or several) that maintained the connections. Then you'd
have to implement an Exim lookup module that made a connection to that
process. It wouldn't be TCP setup, but it would still be at least Unix
domain socket setup. If you have only one process that every Exim process
has to communicate with, it might cause delays because of queueing to
get at the services of that process. I don't say it *will*, I'm just
speculating. I'm no expert on this kind of thing.
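
For what such a broker might look like, here is a skeletal Python
sketch. The socket path and the one-line protocol are invented, and a
dict stands in for the persistent MySQL handle; the point is only the
shape of the thing: one long-lived process that every Exim lookup has
to queue behind.

    # Skeleton of a lookup broker holding a "persistent" connection.
    # Exim-side code would connect to the socket, send a key line,
    # and read back one value line.
    import os
    import socketserver

    SOCK = "/tmp/exim-lookupd.sock"           # invented path
    TABLE = {b"alice": b"alice@example.com"}  # stands in for the DB

    class LookupHandler(socketserver.StreamRequestHandler):
        def handle(self):
            key = self.rfile.readline().strip()
            # A real broker would query its open MySQL connection here.
            self.wfile.write(TABLE.get(key, b"") + b"\n")

    if os.path.exists(SOCK):
        os.unlink(SOCK)
    # One thread per client softens, but does not remove, the queueing
    # problem described above: every Exim process funnels through here.
    with socketserver.ThreadingUnixStreamServer(SOCK, LookupHandler) as srv:
        srv.serve_forever()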

Another disadvantage, of course, is that if your central DB caching
process goes wrong, or crashes, you've killed all mail delivery. With
the current Exim design, if one process has problems, it does not affect
the rest. The daemon is the only process that is vulnerable in this
sense, and even then, it affects only incoming messages. As its job is
well defined and doesn't change, the daemon code isn't edited very
often.

I just don't like putting all the eggs in one basket. Or even a small
number of baskets.

> And heck, while I'm day-dreaming, any chance of being able to tune the
> size of the cache?


The cache is only per-process. It is not inter-process.

Consider again my suggestion above about abstracting relevant data from
your DB and sticking it in cdb files. You could write a permanently
running process that does this every 30 minutes or whatever (or when
prodded). This could keep open a permanent connection to the database.
You could even arrange for it to watch a field in the database that told
it when to rebuild which files. If you do this, the cdb files are your
inter-process cache, and on a busy system their blocks will be in main
memory most of the time - let the OS do flexible caching for you,
according to the amount of free memory. The interlocking between Exim
processes is then only at the filesystem level, and the OS is good at
handling that kind of contention.
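
A sketch of that rebuilder, again in Python with invented names: a
control table carries a version counter that other tools bump whenever
the alias data changes, and the daemon polls it over its one
persistent connection.

    # Long-running rebuilder: watch a version field, rebuild on change.
    # Table and column names are invented; rebuild_aliases_cdb() would
    # do the dump-and-cdbmake step from the earlier sketch.
    import time
    import pymysql

    def current_version(conn):
        with conn.cursor() as cur:
            cur.execute("SELECT version FROM cdb_control"
                        " WHERE name = 'aliases'")
            return cur.fetchone()[0]

    def rebuild_aliases_cdb(conn):
        ...  # as in the export sketch above

    conn = pymysql.connect(host="localhost", user="exim",
                           password="secret", database="mail")
    seen = None
    while True:
        conn.ping(reconnect=True)   # keep the one connection alive
        ver = current_version(conn)
        if ver != seen:
            rebuild_aliases_cdb(conn)
            seen = ver
        time.sleep(60)              # or every 30 minutes, as above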

Philip

--
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.