Re: [exim] exiscan, spamassassin, and per-domain bayes datab…

Author: Marc Sherman
Date:  
To: Steve Lamb
CC: Exim Users List
Subject: Re: [exim] exiscan, spamassassin, and per-domain bayes database
Steve Lamb wrote:
>
> That case is where one person's interests (and thus what they
> consider ham) coincide with the spam that another person (or all
> people, since spam is pretty universal) gets. So which of these
> fields regularly has spam which would use its keywords: Programmer,
> Web Designer, Audiologist, Landscape Architect?
>
> To be honest I just can't remember the last time I got spam offering
> me hot Programmer tips, how to get the most outta my stereos, whether
> astroturf or natural was better. The closest I can come with the
> above list is that there are times I get spam for OEM software which
> includes Dreamweaver. So if your Web Designer uses Dreamweaver you
> might have one term in there that gets wonky.


Yeah, OEM software spam was what I was thinking of. I get a ton of that
in my quarantine folder (and much more rejected outright). Yet I've
never gotten a false negative or a false positive on a message
discussing such software that I could attribute to a wonky bayes score.

FYI, your message was scored as:
   1.0 SARE_OEM_SOFT_IS     BODY: Software that is OEM
  -4.0 BAYES_00             BODY: Bayesian spam probability is 0 to 1%


On the other hand, a random actual OEM spam from my rejectlog scored
(among many other SpamAssassin rule hits):
   4.0 BAYES_99             BODY: Bayesian spam probability is 99 to 100%
   0.8 SARE_OEM_PRODS_1     SARE_OEM_PRODS_1
   0.9 SARE_OEM_PRODS_FEW   SARE_OEM_PRODS_FEW
   0.9 SARE_PRODUCTS_03     SARE_PRODUCTS_03
   0.4 SARE_PRODUCTS_02     SARE_PRODUCTS_02
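
To make the arithmetic concrete, here is a rough sketch that tallies only
the rule hits quoted in the two reports above. The spam hit many other
rules that are not listed, so its real total was higher than what this
computes. The 5.0 threshold is SpamAssassin's stock required_score and is
an assumption here, as are the helper names total, total_without_bayes,
and is_spam.

```python
# Scores copied from the two reports quoted above. The spam report is
# partial: the message says it hit "many other" rules not shown here.
ham_hits = {
    "SARE_OEM_SOFT_IS": 1.0,
    "BAYES_00": -4.0,
}
spam_hits = {
    "BAYES_99": 4.0,
    "SARE_OEM_PRODS_1": 0.8,
    "SARE_OEM_PRODS_FEW": 0.9,
    "SARE_PRODUCTS_03": 0.9,
    "SARE_PRODUCTS_02": 0.4,
}

# Assumption: the stock SpamAssassin threshold, not a value from this thread.
REQUIRED_SCORE = 5.0


def total(hits):
    """Sum the rule scores, rounded to one decimal like the report."""
    return round(sum(hits.values()), 1)


def total_without_bayes(hits):
    """Same sum, ignoring the BAYES_* rules, to show what they contribute."""
    return round(
        sum(score for rule, score in hits.items() if not rule.startswith("BAYES_")),
        1,
    )


def is_spam(hits, required=REQUIRED_SCORE):
    """True when the summed score reaches the threshold."""
    return total(hits) >= required


if __name__ == "__main__":
    for label, hits in (("ham", ham_hits), ("spam", spam_hits)):
        print(label, total(hits), total_without_bayes(hits), is_spam(hits))
```

With the Bayes rules left out, the listed OEM rule hits alone put the two
messages only a couple of points apart; the BAYES_00 and BAYES_99 scores
are what separate them clearly.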


So my bayes db seems to do just fine, in spite of the confusion about
OEM software. Sure, it's weakened on the actual software keywords, but
there appear to be more than enough other recognizable tokens to
differentiate the spam from the ham, so it doesn't matter.
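
A toy illustration of why a few ambiguous tokens do not sink the result.
This is a simplified naive-Bayes style combination, not SpamAssassin's
actual Bayes implementation (which combines token probabilities
differently), and every token probability below is made up for the
example.

```python
import math


def combine(probs):
    """Combine per-token spam probabilities into one message probability.

    Uses prod(p) / (prod(p) + prod(1 - p)), computed in log space.
    Each p must be strictly between 0 and 1.
    """
    log_spam = sum(math.log(p) for p in probs)
    log_ham = sum(math.log(1.0 - p) for p in probs)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


# Hypothetical values: "oem" and "dreamweaver" are seen equally in ham
# and spam (0.5), so they are the "weakened" software keywords.
ham_message = {
    "oem": 0.5,
    "dreamweaver": 0.5,
    "exim": 0.01,
    "bayes": 0.02,
    "spamassassin": 0.02,
}
spam_message = {
    "oem": 0.5,
    "dreamweaver": 0.5,
    "cheap": 0.98,
    "discount": 0.97,
    "order": 0.90,
}

if __name__ == "__main__":
    print("ham :", combine(ham_message.values()))
    print("spam:", combine(spam_message.values()))
```

A token at 0.5 contributes nothing in either direction, so the verdict is
decided entirely by the other tokens in each message.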

- Marc