Re: [exim] [OT] an automated spam filtering technique

From: Marc Perkel
Date:  
To: exim-users
Subject: Re: [exim] [OT] an automated spam filtering technique
I don't think it's off topic at all. That's very interesting. I'm going
to experiment with your list. If you have a longer list I'd be
interested in that too.

Stanislaw Halik wrote:
> Hello,
>
> Although the subject is offtopic to Exim itself, I've decided to post
> it, knowing that many inspired spam fighters read the list.
>
> Many spam messages are sent out in large quantities as identical
> copies, and even when they are not, certain expressions recur in many
> of them.
>
> I've just started publishing my spamtraps and I got around 160
> text/plain parts from them.
>
> I searched for 5-word phrases that occur multiple times. Here's
> what I got (count, then phrase):
>
>   11    view available updated software from
>   11    new software for you click
>   11    some new software for you
>   11    you click here to view
>   11    here to view available updated
>   11    for you click here to
>   11    to view available updated software
>   11    software for you click here
>   11    has uploaded some new software
>   11    uploaded some new software for
>   11    click here to view available
>   10    in symbols for i386 i386
>   10    number lotto ball number lotto
>   10    reading in symbols for i386
>   10    ball number lotto ball number
>   10    lotto ball number lotto ball
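
For anyone who wants to try this, a count like the one above can be sketched in a few lines of Python. This is only an illustration, not the script that produced the list: the function names (shingles, common_phrases), the tokenisation regex, and the choice to count each phrase once per message are all assumptions of mine.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters, digits or apostrophes.
WORD_RE = re.compile(r"[a-z0-9']+")


def shingles(text, n=5):
    """Return the set of n-word phrases found in text (lowercased)."""
    words = WORD_RE.findall(text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def common_phrases(messages, n=5, min_count=2):
    """Count in how many messages each n-word phrase appears."""
    counts = Counter()
    for body in messages:
        # shingles() returns a set, so a phrase repeated inside one
        # message is still counted only once for that message.
        counts.update(shingles(body, n))
    return [(p, c) for p, c in counts.most_common() if c >= min_count]


if __name__ == "__main__":
    # Made-up sample bodies, standing in for extracted text/plain parts.
    sample = [
        "Click here to view available updated software now",
        "please click here to view available offers",
    ]
    for phrase, count in common_phrases(sample):
        print(f"{count:5d}    {phrase}")
```

Counting per message rather than per occurrence keeps one message that repeats itself from dominating the list; whether that matches the numbers quoted above is not something I can tell from the post.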

>
> I run my spamtrap mail through procmail, which passes it to mimedump
> to extract the text/plain parts. I might write a script that generates
> SpamAssassin rulesets from popular spam phrases and then updates the
> SA configs the same way I do for other rulesets. It would be fully
> automated, needing no human intervention whatsoever.
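
The rule-generation step could look roughly like the sketch below. It is a guess at one possible shape, not the poster's script: the LOCAL_TRAP rule-name prefix, the 0.5 default score and the make_rules function are hypothetical, and it assumes each phrase consists of plain words. The body, describe and score lines are standard SpamAssassin rule directives.

```python
import re


def make_rules(phrases, score=0.5, prefix="LOCAL_TRAP"):
    """Render (phrase, count) pairs as SpamAssassin body rules.

    The prefix and the low default score are illustrative assumptions.
    """
    lines = []
    for i, (phrase, count) in enumerate(phrases, start=1):
        name = f"{prefix}_{i:04d}"
        # Join words with \s+ so differences in spacing or line
        # wrapping between messages do not prevent a match.
        pattern = r"\s+".join(re.escape(w) for w in phrase.split())
        lines.append(f"body     {name}  /\\b{pattern}\\b/i")
        lines.append(f"describe {name}  Spamtrap phrase seen {count} times")
        lines.append(f"score    {name}  {score}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two phrases taken from the list in the quoted message.
    print(make_rules([
        ("click here to view available", 11),
        ("lotto ball number lotto ball", 10),
    ]))
```

Running spamassassin --lint on the generated file before installing it would catch a malformed rule, which seems worth doing when no human looks at the output.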
>
> Any comments or criticism? The only vulnerability I can think of is
> spammers poisoning the results with non-spammy phrases, but I don't
> think they would all do so because of a method created by one
> insignificant spam fighter. Nevertheless, I wouldn't assign extremely
> high SA scores, as it's a fully automated process.
>
>