Re: [Exim] Most efficient spam rejection with Bogofilter

Author: Exim Users Mailing List
Date:  
To: matt
CC: Exim Users Mailing List
Subject: Re: [Exim] Most efficient spam rejection with Bogofilter
[ On Tuesday, December 2, 2003 at 18:28:15 (-0000), Matt Sealey wrote: ]
> Subject: [Exim] Most efficient spam rejection with Bogofilter
>
> Just wondering what the most efficient/successful way is to
> reject Spam once it has been evaluated by Bogofilter?


If you're thinking of doing that in your mail server for all of your
site's e-mail, then that's not a very good idea.

Bogofilter and its brethren are best used directly by individual users
with an interface directly in their MUA.

The first, and perhaps most important, problem is that unless all your
local users have almost identical mail usage patterns, such
self-adjusting, statistically based token-analysis filtering techniques
will quickly degrade into uselessness. (Of course, if you're the only
e-mail user on your system then that's a slightly different story :-)

The second problem is that bogofilter requires regular re-training. It
can get quite counter-productive and will rapidly get "confused" if you
don't regularly re-classify all the messages it gets wrong, and it
_will_ get some wrong (especially since by default it errs heavily on
the side of caution). That's why all the mail clients with something
like it built in (Mozilla, Apple Mail, etc.) have a very easy to use
"junk/un-junk" button for re-classifying any mis-identified messages.

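As a rough illustration of what such a "junk/un-junk" button has to do
behind the scenes, here is a minimal Python sketch. It assumes the
registration flags described in the bogofilter manual page (-s/-n to
register a message as spam/non-spam, -S/-N to undo an earlier
registration, -d to name the wordlist directory). The function names
are made up for this example, and the combined -Ns/-Sn forms are only
right when the message really was registered under the wrong class
beforehand (e.g. by auto-update); otherwise plain -s or -n is the safer
choice.

```python
import subprocess

# Flags as described in the bogofilter manual page:
#   -s / -n  register a message as spam / non-spam
#   -S / -N  undo an earlier spam / non-spam registration
CORRECTION_FLAGS = {
    "spam": "-Ns",  # was counted as non-spam: undo that, register as spam
    "ham": "-Sn",   # was counted as spam: undo that, register as non-spam
}


def correction_command(true_class, wordlist_dir=None):
    """Build the argv that re-registers one message under its true class."""
    if true_class not in CORRECTION_FLAGS:
        raise ValueError("true_class must be 'spam' or 'ham'")
    argv = ["bogofilter", CORRECTION_FLAGS[true_class]]
    if wordlist_dir is not None:
        # Per-user wordlist directory (hypothetical path supplied by caller).
        argv += ["-d", wordlist_dir]
    return argv


def reclassify(message_bytes, true_class, wordlist_dir=None):
    """Pipe one raw message to bogofilter on stdin; return its exit status."""
    result = subprocess.run(
        correction_command(true_class, wordlist_dir), input=message_bytes
    )
    return result.returncode
```

A mail client's "junk" button would then call something like
reclassify(raw_message, "spam") for each message the user marks.
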
I've been using bogofilter for a while now (finding it better than bmf).
Initially I trained it with my personal mail archives, and it still very
often misidentifies the 419-fraud scams as non-spam, sometimes even
giving them a spam rating of less than 0.5, perhaps because there were
relatively few of those in my archived spam folders!
(btw, exim-users mail gets a consistent spamicity=0.000000 :-)
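
To find which known-spam messages slip under a given rating, the
spamicity value can be read back out of the header bogofilter adds in
passthrough mode. The following Python sketch assumes that header is
named X-Bogosity and carries a "spamicity=<number>" field; the helper
names, the 0.5 default threshold and the sample header in the usage
note are only illustrative.

```python
import re
from email import message_from_string

# Assumed field format inside the header: "spamicity=0.000000"
SPAMICITY_RE = re.compile(r"spamicity=([0-9]*\.?[0-9]+)")


def spamicity(raw_message):
    """Return the spamicity from the X-Bogosity header, or None if absent."""
    header = message_from_string(raw_message).get("X-Bogosity")
    if header is None:
        return None
    match = SPAMICITY_RE.search(header)
    return float(match.group(1)) if match else None


def missed_spam(known_spam, threshold=0.5):
    """List (index, score) for known-spam messages scoring below threshold."""
    missed = []
    for index, raw_message in enumerate(known_spam):
        score = spamicity(raw_message)
        if score is not None and score < threshold:
            missed.append((index, score))
    return missed
```

For example, a message whose header reads "X-Bogosity: No,
tests=bogofilter, spamicity=0.000000" would be reported by
missed_spam() if it sat in a folder of known spam, making it a
candidate for re-training.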

(I have considered filtering my outgoing mail through it too, with a
forced "ham" declaration of course, mostly just for fun, but maybe that
would help train it a bit better as time goes on so it catches more
scammer mail too.)

--
                        Greg A. Woods


+1 416 218-0098                  VE3TCP            RoboHack <woods@???>
Planix, Inc. <woods@???>          Secrets of the Weird <woods@???>