Re: [exim] Speeding Up Exim

Author: Matt
Date:  
To: Exim Mailing List
Subject: Re: [exim] Speeding Up Exim
> Actually, so far you failed to state a problem.
> Load is something like "number of processes waiting to be processed" and
> I have seen servers with load values higher than 1000 which still
> reacted faster than my laptop.
>
> Load is not a problem. It might indicate a problem, but load itself is
> only a symptom.
>
> Disk I/O might be an explanation for "high" load values. There are
> countless others. Find the real one before desperately trying out
> possible solutions... when I read your questions and the first answers,
> which all circled around "optimising disk I/O by tuning the kernel", I
> just felt desperation. Nobody even considered whether or not your
> assumptions sound reasonable. Nobody even asked questions...
>
> First, 2000 email accounts does not sound like a big deal, but the number
> of messages does. Of course, that's just 20 messages/mailbox. You
> mention one hard disk and that you can't afford a long downtime. That
> worries me. You can't provide reliable service without decent storage
> (RAID 1 or RAID 5, plus backups, of course) and regular maintenance. Fix
> that.
>
> In order to understand the real problem, you need data about your
> system. For example, a recent top (I prefer atop) on a recent kernel will
> show a value "wa", short for I/O wait. If you really have a
> problem with disk I/O, "wa" should show it.
> Cpu(s): 1.4%us, 0.5%sy, 0.0%ni, 98.1%id, 0.0%wa, 0.0%hi, 0.0%si,
> 0.0%st


Since making the SpamAssassin "bayes_learn_to_journal 1" Bayes tweak I
have not seen any serious load issues. Today is the closest I have seen,
and it's nowhere near as bad as it was in the past. Here is a portion of
top. Previously the load average would break 100 at peak times and stay
there quite a while. This peak of ~33 did not last long either; it
quickly dropped back below 10 and is now below 5.

top - 11:27:13 up 1 day, 19:27,  1 user,  load average: 31.33, 33.04, 18.19
Tasks: 251 total,   1 running, 241 sleeping,   0 stopped,   9 zombie
Cpu(s): 10.1% us,  3.3% sy,  0.0% ni, 25.4% id, 61.0% wa,  0.2% hi,  0.0% si
Mem:   4115344k total,  3785164k used,   330180k free,   267060k buffers
Swap:  2031608k total,        0k used,  2031608k free,  2203720k cached


  PID USER      PR  NI %CPU    TIME+  %MEM  VIRT  RES  SHR S COMMAND
  346 nvcs      15   0    4   0:47.58  1.3 60580  51m 2964 S spamd
  347 root      16   0    2   0:43.70  1.2 58764  49m 2952 S spamd
 3536 clamav    16   0    1  21:21.45  2.6  176m 104m 1052 S clamd
 4484 named     18   0    1  24:56.85  1.4  103m  57m 1936 S named
15223 nvcs      17   0    1   0:00.02  0.1  8376 3632 2020 S pyzor
  493 root      15   0    0   3:28.21  0.0     0    0    0 D kjournald
 3411 root      16   0    0   0:34.79  0.0  2392  492  360 S dovecot
15118 bbwi      18   0    0   0:00.06  0.0  8340 1148  700 D exim
15229 nvcs      18   0    0   0:00.01  0.0  7296 1152  700 D exim
28803 mail      15   0    0   0:40.12  0.0  7820 1236  852 S exim
28860 root      15   0    0   2:04.87  1.0 47192  38m 3252 S spamd
    1 root      16   0    0   0:04.90  0.0  2860  548  468 S init
    2 root      RT   0    0   0:00.54  0.0     0    0    0 S migration/0
    3 root      34  19    0   0:06.35  0.0     0    0    0 S ksoftirqd/0
    4 root      RT   0    0   0:00.40  0.0     0    0    0 S migration/1
    5 root      34  19    0   0:05.23  0.0     0    0    0 S ksoftirqd/1
    6 root       5 -10    0   0:00.19  0.0     0    0    0 S events/0
    7 root       5 -10    0   0:00.08  0.0     0    0    0 S events/1
    8 root      11 -10    0   0:00.00  0.0     0    0    0 S khelper
    9 root      15 -10    0   0:00.00  0.0     0    0    0 S kacpid
   43 root       5 -10    0   0:00.00  0.0     0    0    0 S kblockd/0
   44 root       5 -10    0   0:00.00  0.0     0    0    0 S kblockd/1
   45 root      15   0    0   0:00.00  0.0     0    0    0 S khubd
   62 root      15   0    0   0:16.38  0.0     0    0    0 S pdflush



> I highly recommend running "munin" on every Exim server. It will gather
> lots of numbers regarding your server, which can be invaluable when
> facing problems.
>
> In my experience the most likely cause of high load values on
> Exim servers is DNS-related. If you process 40k messages an hour and use
> SpamAssassin, more than 500k DNS requests are likely. If you haven't
> set up a local caching DNS daemon on your mail server, you
> should do that now. If you already have a local caching DNS daemon on
> your mail server, consider moving it to a different server.


I am running BIND on the server, and from what I have heard BIND does
not create a significant system load.
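
A quick sanity check (a sketch, assuming a glibc-style resolver setup;
paths may differ on other distros) is to confirm the box actually
queries itself first. If 127.0.0.1 is not the first nameserver listed,
the local BIND is not caching for the MTA at all:

```shell
# Show the resolvers the system libraries will use, in order.
# For a local caching setup, 127.0.0.1 should appear first.
grep '^nameserver' /etc/resolv.conf
```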

> Any further advice would be wild guessing, so it's up to you to provide
> further data.


I still think this is a disk I/O issue but am no expert. Perhaps
there are more tweaks I can do to reduce disk I/O.

Do entries like this in exim.conf create more disk I/O?

# deny email addresses listed in file
deny recipients = lsearch;/etc/virtual/blocked_email
      message = Email account suspended due to inactivity


I use that to auto-suspend email accounts that have not been
used/checked in over 6 months. There are a number of other similar
entries used for things such as popb4smtp. Not sure how efficiently
entries like that work.
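
For a small file an lsearch is cheap (and the OS will usually keep it in
the page cache), but Exim scans an lsearch file linearly, so for large
or frequently consulted lists an indexed lookup scales better. A
possible sketch, assuming the stock exim_dbmbuild utility that ships
with Exim and one address per line in the source file (filenames here
are illustrative):

```
# Build an indexed DBM file from the flat list; rerun whenever the
# flat file changes, e.g. from the suspension script:
#   exim_dbmbuild /etc/virtual/blocked_email /etc/virtual/blocked_email.db

# Then switch the ACL condition from lsearch to dbm:
deny recipients = dbm;/etc/virtual/blocked_email.db
      message = Email account suspended due to inactivity
```

Whether this is worth doing depends on the list size; for a few hundred
entries the difference is probably lost in the noise next to the %wa
shown in top above.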

Matt