[exim] Need help clarifying Exim performance changes I am su…

Author: Justin Fossey
Date:  
To: exim-users
Subject: [exim] Need help clarifying Exim performance changes I am suggesting to dedicated server host to prevent load spikes
Hi

This is a rather long support request, and for that I am sorry. Any help or clarification would be
helpful.

Background:
-----------

I am currently in a discussion with my managed dedicated server service provider about making
performance related changes to the Exim setup.

I have very limited access to the server and can't see all the config files, so what I have learnt
is through trial and error and with the assistance of the support department of my hosting company.

As I am just a client, and to keep things clean, I will not be naming the hosting company. I have a
good relationship with them and I would like to keep it that way.


The Problem:
------------

The problem we have been having is that our server has major load spikes a few times a day, during
which other services, such as the LAMP stack, struggle and start to time out.

Their dedicated server setup uses Puppet to manage all their managed dedicated servers. We, as
clients, are allowed to request limited changes through extra customization settings files that
are not managed by Puppet and that override or extend the current settings.

I have successfully made lots of customizations to PHP and MySQL using these same customization
files.

All servers run Debian and, I think, the default Debian Exim config files. If there are any changes
to or customizations of the default packaged Debian Exim config files, my understanding is that
they are minor.


My suggested changes:
---------------------

Based on my limited investigation, the load spikes seem to coincide with a spike in the number of
Exim processes. Normally there are between 1 and 5 Exim processes running and load is low. When the
load spikes, there are between 10 and 50 Exim processes and the load jumps to between 10 and 90.
This lasts a few minutes, 5-8 minutes at most, and then load drops quickly and responsiveness
returns to normal.
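
A minimal sketch of how this correlation could be logged over a day, assuming a Linux host where
`ps` and /proc/loadavg are available to an unprivileged user and the Exim processes are named
"exim" or "exim4". The function names are made up for illustration.

```shell
#!/bin/sh
# Count stdin lines that are exactly "exim" or "exim4" (Debian names the binary exim4).
count_exim() {
    # grep -c still prints 0 when nothing matches but exits non-zero; mask that exit status.
    grep -c -x -E 'exim4?' || true
}

# Print one sample line: timestamp, Exim process count, 1-minute load average.
sample() {
    procs=$(ps -eo comm= | count_exim)
    load=$(cut -d' ' -f1 /proc/loadavg)
    printf '%s exim_procs=%s load1=%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$procs" "$load"
}
```

Running something like `while sleep 10; do sample; done >> exim-load.log` through one of the spikes
should show whether the process count rises before or after the load does.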

My requested changes based on my reading of the Docs are the following.

remote_max_parallel = 1
queue_smtp_domains = *

My understanding is that this should reduce the number of processes started for remote deliveries
and should mitigate these spikes by preventing immediate delivery to remote SMTP hosts.

Maybe later if needed, I might consider reducing queue_run_max.


Support Admin feedback:
-----------------------

The problem is that when I requested that the queue_smtp_domains setting be added to the
customizations file, I got the following response from the support admins.

> When trying to add the last customisation, Exim4 failed to restart the process saying that
> 'queue_smtp_domains' as already been defined in the main config.
>
> The current setting is:
> queue_smtp_domains = +etrn_domains


After additional investigation I have more or less worked out, though I can't be 100% sure, that
they have some of the following settings in the main config file.

domainlist etrn_domains = lsearch;/etc/exim/domains.etrn
domainlist our_domains = +local_domains :\
                         +virtual_domains :\
                         +vmail_domains :\
                         +uucp_domains :\
                         +smart_route_domains :\
                         +etrn_domains


queue_smtp_domains = +etrn_domains
queue_domains = +etrn_domains
hold_domains = +etrn_domains
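
Rather than inferring these values, it may be possible to print them directly with Exim's -bP
option, which outputs the value of the named configuration options. This is only a sketch:
`show_opts` and the `EXIM_BIN` override are hypothetical names, the binary may live outside the
PATH (for example under /usr/sbin), and some values may be hidden from non-admin users.

```shell
#!/bin/sh
# EXIM_BIN is a hypothetical override; Debian usually installs the binary as exim4.
show_opts() {
    # -bP prints the current value of each configuration option named after it.
    "${EXIM_BIN:-exim4}" -bP "$@"
}
```

For example, `show_opts queue_smtp_domains queue_domains hold_domains` would print the three
settings above, and `show_opts +etrn_domains` should print the definition of the named list.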

My understanding is that the setting hold_domains is not compatible with queue_domains or
queue_smtp_domains. So I can't ask for "*" to be added to the domains.etrn file, as this would
affect the other settings as well.

I later also got the following response from them.

> You are correct, domains.etrn file that will get checked for additional settings.
>
> However this file is always empty unless you specify differently. This empty file acts the same
> as the '*' according to our administrators.
>
> 'hold_domains' is also set to this empty file and will have no impact on mail delivery at the
> moment.


I believe the second line of the above response to be wrong, or the result of some misunderstanding
between me, support and the admins.
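
One way the claim about the empty file could be checked is Exim's expansion test mode (-be) with a
plain lsearch lookup against the same file. This is a sketch under assumptions: `listed_in` and
`EXIM_BIN` are hypothetical names, the file must be readable by the user running the command, and
it tests the lookup on its own rather than the full domain list.

```shell
#!/bin/sh
# listed_in DOMAIN FILE: ask Exim whether DOMAIN is found as a key in the lsearch FILE.
listed_in() {
    # The leading \$ keeps the shell from expanding ${lookup...}; Exim expands it instead.
    "${EXIM_BIN:-exim4}" -be "\${lookup{$1}lsearch{$2}{listed}{not listed}}"
}
```

Running `listed_in example.com /etc/exim/domains.etrn` against an empty file should report
"not listed", which would suggest that the empty file matches no domain rather than every domain.
As far as I understand it, a plain lsearch only matches keys literally, so even a "*" line in the
file would not act as a wildcard unless the lookup type were lsearch* or wildlsearch.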


My Questions:
-------------

So my question is: what can I propose they change? A change to a Puppet-managed config file is
something that requires major approval and has to be rolled out to all their servers for
consistency.

Is there a way to set the requested change in the customizations file so that it overrides or
extends this setting and Exim restarts successfully, without changing the main config file that is
managed by Puppet?

I believe their setup is incorrect and that you would never set queue_smtp_domains, queue_domains
and hold_domains to the same domain list. Is my understanding right or wrong?

If this is how the default Debian package is configured, and it is technically incorrect, I am a
little shocked.


Any help or clarification would be greatly appreciated.

Justin