Re: [exim] Finding system bottlenecks to speed up Exim

Author: W B Hacker
Date:  
To: exim users
Subject: Re: [exim] Finding system bottlenecks to speed up Exim
Marc Perkel wrote:
> I'm trying to figure out where the system bottlenecks are to speed up my
> main Exim server. As many of you know I do front end spam filtering.
> Email comes in, I clean it, and the good email goes off to the
> customer's server.
>
> I just took on a new customer who has an incredible amount of spam
> coming in, and it's keeping up but taxing the server, and I'm trying to
> figure out where the bottleneck might be. The spam bots are hitting it
> between 10-30 times a second.


I thought you had an IP-based blacklisting / auto firewalling toolset of
your own?

With that sort of bot-load, I'd probably already have locally
blacklisted several rather large blocks.

As we do for major networks in about a dozen jurisdictions (with VIP /
whitelisting as well, of course).

>
> I'm not running spam assassin or anything else but Exim on this server.
> It is also configured to try to deliver good email only once and if it
> fails it transfers the message to another server that does the retries.
> So the message queues are not big.
>
> The server is a dual core AMD running at 3 GHz and has 8 GB of RAM.
> Running 64bit Fedora 8.
>
> The connection count floats from 1000 to 2500 connections. At 1200
> connections load level is about 20. The server still processes email
> fairly fast at load levels of 100.


Be happy. That load level sounds like more of a brag than a complaint...

> But I'd like to get the levels down
> if possible. I suspect the high connection count is somewhat related to
> spam bots failing to close connections and waiting till it times out.
>


We've normally seen the *reverse*, and hence use short delays effectively.
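
One way to check which of us is right on a given box is to tally the TCP
states of the connections on the SMTP port. The following is only a rough
sketch, assuming a Linux host that exposes IPv4 sockets in /proc/net/tcp
(IPv6 sockets live in /proc/net/tcp6 and are not covered here). The
function name tally_states and the default port of 25 are my own choices
for illustration, not anything from Exim.

```python
from collections import Counter

# State codes from the "st" column of Linux /proc/net/tcp (hex).
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def tally_states(proc_net_tcp_text, local_port=25):
    """Count sockets per TCP state for one local port.

    proc_net_tcp_text is the full text of /proc/net/tcp, header included.
    local_port=25 is an assumption (plain SMTP); adjust as needed.
    """
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated rows
        # fields[1] looks like "0100007F:0019": hex address, hex port.
        port = int(fields[1].rsplit(":", 1)[1], 16)
        if port != local_port:
            continue
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts
```

To use it, pass in the file contents, e.g.
tally_states(open("/proc/net/tcp").read()). A pile of ESTABLISHED sockets
that never send anything would fit the "bots sit there until timeout"
theory; mostly short-lived entries would fit what we see here.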

> I'm running the old xosview program to watch various loads. What I'm
> seeing is that the CPU is not pegged at 100% all the time. Most of the
> time it's below 100%. Xosview shows most CPU time is being used by the
> system (kernel?) as opposed to by usr (applications?).
>
> Disk IO is not real heavy. It has a reasonably fast SATA II drive that's
> not very full. Writes about 3 gigs of log file entries a day.
>
> What I'm thinking is the slow part is the TCP stack. That is, managing
> the number of connections is what is causing the high load. If what I
> suspect is true then are there any tricks to make TCP go faster?
>
> Or - what tools or tricks can I use to see where the bottleneck is?
>
> Thanks in advance.
>
>


Given that you are not running SA et al, I can't imagine getting more
than a ten-percent improvement even if you optimized everything within
reach - not while still keeping all the functionality and stability.
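
As for seeing where the time actually goes, the numbers xosview draws
come from the kernel's own counters, and they are easy to sample directly.
Below is a minimal sketch, assuming a Linux /proc/stat whose aggregate
"cpu" line carries at least the user, nice, system, idle, iowait, irq and
softirq columns (later columns such as steal are ignored). The helper
names parse_cpu_line and cpu_split are made up for this example.

```python
# Columns of the aggregate "cpu" line in /proc/stat, in order.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_cpu_line(line):
    """Turn the first line of /proc/stat into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    if len(values) < len(FIELDS):
        raise ValueError("cpu line has fewer columns than assumed")
    return dict(zip(FIELDS, values))


def cpu_split(before_line, after_line):
    """Percentage of elapsed CPU ticks spent in each category
    between two samples of the aggregate 'cpu' line."""
    before = parse_cpu_line(before_line)
    after = parse_cpu_line(after_line)
    delta = {k: after[k] - before[k] for k in FIELDS}
    total = sum(delta.values())
    if total == 0:
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * v / total for k, v in delta.items()}
```

Read the first line of /proc/stat, sleep a few seconds, read it again,
and feed both lines to cpu_split. If system plus irq plus softirq
dominates while iowait stays low, that would point toward kernel and
network work rather than disk, which is at least consistent with the
TCP-stack suspicion.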

Sounds like cause for a separate box for this sort of load.

Bill