Re: [exim] Reducing Spam Assassin Load

Author: Peter Bowyer
Date:  
To: exim-users
Subject: Re: [exim] Reducing Spam Assassin Load
On 02/10/05, Marc Perkel <marc@???> wrote:


> Peter - where you are wrong is that SA has VERY high overhead.


Where I am wrong about what? I don't think I said anything to the
contrary. You're not reading what I'm saying.


> It's a perl
> program running thousands of regular expressions, network lookups, Bayesian
> comparison, and learning.


Thanks - the eggsucking will go much better now. From your loving Grandma.

> I love SA but avoiding it is a huge increase in
> performance. Especially if avoiding it is simple.


And my point remains that the avoiding of SA where that means not
letting it see the ham will stop the Bayesian bit working. If that's
not a problem for you, then fine - but don't tell me I'm wrong about
SA when what you mean is that you don't see this downside as a problem
in your circumstances.

>
> People use things like greylisting and such to avoid using SA to reduce load
> processing spam. But no one has done anything to reduce load processing ham.
> Every good message runs through SA and I think there's a way to bless some
> of that especially where the same message is coming in for lots of users.
> This would really cut down peak loads.


No one has done this? I think you mean that *you* haven't. Please try
and resist the temptation to speak for everyone else.


> I like the idea of accessing the AWL of SA possibly to see if I can find
> ways to bypass SA based on reputation. What would be ideal is to do some
> sort of hints database with a short-term reputation that expires in hours.


The problem, as always, is defining 'reputation'. Should be easy
enough (as I suggested before) to update a database with
greylisting-style data after a message passes SA, and check this db in
an early acl for the next message, but is this enough of an indication
of reputation to risk bypassing the very thing you've implemented for
the purpose of scoring a message in real time? (I'm not saying it
isn't, I'd want to see stats though)

I guess it's all in the timing - you need the whitelist window to be
long enough to continue letting in the good stuff, with a re-scan
every now and then to keep the db entry fresh, but not so long as to
let in spam which disguises itself with the same fingerprint, which is
not uncommon given the habits of zombies.
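
To make the timing trade-off concrete, here is a minimal sketch in Python
of the kind of short-term hints database discussed above. Everything in it
is an assumption for illustration: the class name ReputationCache, the
choice of fingerprint (for example sending IP plus envelope sender), the
four-hour window and the re-scan interval are all hypothetical, and a real
setup would keep the data somewhere Exim can query from an early ACL
rather than in memory.

```python
import time


class ReputationCache:
    """Hypothetical short-term whitelist keyed by a message fingerprint."""

    def __init__(self, window=4 * 3600, rescan_every=20, clock=time.time):
        self.window = window              # seconds an entry stays valid
        self.rescan_every = rescan_every  # force a full scan after N bypasses
        self.clock = clock                # injectable so the logic can be tested
        self._entries = {}                # fingerprint -> [expires_at, bypass_count]

    def record_pass(self, fingerprint):
        """Call after SA has scored a message as ham: (re)start the window."""
        self._entries[fingerprint] = [self.clock() + self.window, 0]

    def record_fail(self, fingerprint):
        """Call after SA has scored a message as spam: drop any reputation."""
        self._entries.pop(fingerprint, None)

    def should_bypass(self, fingerprint):
        """Return True if SA can be skipped for this fingerprint right now."""
        entry = self._entries.get(fingerprint)
        if entry is None:
            return False                  # never seen: scan it
        expires_at, bypassed = entry
        if self.clock() >= expires_at:
            del self._entries[fingerprint]
            return False                  # window expired: scan it
        if bypassed >= self.rescan_every:
            return False                  # re-scan due; record_pass resets the count
        entry[1] = bypassed + 1
        return True
```

The early check would call should_bypass() and the post-scan step would
call record_pass() or record_fail(). The periodic re-scan keeps the entry
fresh and also ensures SA still sees some of the ham, while the expiry
limits how long a zombie reusing the same fingerprint could ride on it.
Whether a given window is safe is exactly the question that needs stats.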

Interesting stuff, though.

Peter


--
Peter Bowyer
Email: peter@???
Tel: +44 1296 768003
VoIP: sip:peter@???