Re: [exim] SMTP processing hangs during bayes sync

Author: W B Hacker
Date:  
To: exim users
Subject: Re: [exim] SMTP processing hangs during bayes sync
Bodo Gelbe wrote:

> I've the following problem with exim and exiscan:
>
> Exim calls spamd in the DATA ACL (for messages from
> external hosts).
>
> spamassassin/spamd is configured with
>
> bayes_auto_learn 1
> bayes_journal_max_size 0
> bayes_auto_expire 0
> bayes_learn_to_journal 1
>
> From time to time (cronjob) the database and the journal are
> synchronized (sa-learn --sync). During the synchronization
> process SMTP requests are hanging for 20 to 50 seconds (even
> those from local hosts, which will not be scanned by spamd).
> The number of exim processes increases.


Does 'exiwhat' confirm that local traffic is in the pile of slowed-down
processes? Does it tell you anything else that might be useful?

What is the CPU/memory load at the time?

>
> Looks like something is locked within exim before spamd
> is called, which locks out processing for messages not
> undergoing spamd processing.
>


It may be that the processes that *are* awaiting SA have used up enough
resources, or hit enough limits, that the others are simply waiting for earlier
processes to finish.

Are your various limits all at their defaults, or have you changed any?

> Any idea how to avoid the delay (don't want a transport
> because we're doing greylisting based upon the spam level)?
>
> kr, Bodo
>


Aside from moving the scheduled sync to a low-load time of day, or providing
greater resources, I'll suggest a contrarian approach:

- Check how often the Bayes contribution to the total SA score has been high
enough to 'tip the balance' for a message that was not *already* firmly tagged
as spam by other rules, OR deniable for protocol, invalid recipient/domain,
format, attachment, MIME, or viral reasons.
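
One hedged way to run that check is the Python sketch below. It assumes you can
collect the X-Spam-Status header values of scanned messages in the usual
"score=... required=... tests=..." form (older SA versions wrote "hits="), and
that you supply the Bayes rule scores from your own configuration. The function
name and the score values in the example are placeholders for illustration, not
SA defaults.

```python
import re

# Matches the score, threshold and rule list of an X-Spam-Status value.
# Folded headers (newline + whitespace inside the tests list) are tolerated.
STATUS_RE = re.compile(
    r"(?:score|hits)=(?P<score>-?\d+(?:\.\d+)?)\s+"
    r"required=(?P<required>-?\d+(?:\.\d+)?)\s+"
    r"tests=(?P<tests>.*?)(?=\s+[a-z]+=|\s*$)",
    re.DOTALL,
)


def count_bayes_tipped(status_headers, bayes_scores):
    """Return (tipped, spam_total).

    spam_total: messages whose score reached the required threshold.
    tipped: those that would have stayed below it without the Bayes points.
    bayes_scores: mapping of Bayes rule name -> score, taken from your config.
    """
    tipped = 0
    spam_total = 0
    for header in status_headers:
        match = STATUS_RE.search(header)
        if match is None:
            continue  # header not in the assumed format, skip it
        score = float(match.group("score"))
        required = float(match.group("required"))
        if score < required:
            continue  # not tagged as spam at all
        spam_total += 1
        tests = [t.strip() for t in match.group("tests").split(",") if t.strip()]
        bayes_points = sum(bayes_scores.get(t, 0.0) for t in tests)
        if bayes_points > 0 and score - bayes_points < required:
            tipped += 1
    return tipped, spam_total


# Example with made-up header values and placeholder Bayes scores.
example_headers = [
    "Yes, score=7.2 required=5.0 tests=BAYES_99,HTML_MESSAGE autolearn=no",
    "Yes, score=12.0 required=5.0 tests=BAYES_99,RCVD_IN_XBL autolearn=no",
    "No, score=1.0 required=5.0 tests=BAYES_50 autolearn=no",
]
example_scores = {"BAYES_99": 3.5, "BAYES_50": 0.1}
print(count_bayes_tipped(example_headers, example_scores))
```

If 'tipped' is a small fraction of 'spam_total' over a representative sample,
Bayes is rarely the deciding factor. Note that this only looks at the SA score;
messages you would have refused anyway for protocol or recipient reasons need to
be excluded from the sample by hand.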

Hint: In the as-issued SA, most Bayes scores are so minuscule they rarely
contribute enough to matter, yet Bayes consumes significant resources.

Worse - on a server, where one user's traffic can be so very different from
another's, Bayes seems to get confused far more easily than when run in an
individual user's MUA. (It pays its way in Mozilla/Thunderbird and others...)

Server-side, we've shut Bayes off entirely, with more benefit than harm.

Greylisting, BTW, is likely to more than double your connection load (retries
may be idiotically rapid from zombies), spawning child processes that may not
go all the way through to the DATA phase, but will certainly consume resources.

It might help to run with 'queue_only' set and adjust your queue runner
interval until the resources balance better.

YMMV,

Bill