Re: [exim] Caching Spammers to Speed Processing

Author: Alun
Date:  
To: exim-users
Subject: Re: [exim] Caching Spammers to Speed Processing
Marc Perkel (marc@???) said, in message
    <42112CC2.7010105@???>:

>
> Just wondering if anyone else is doing this trick. Here's what I'm
> experimenting with.


We've been using this sort of "feedback" for several years, to great
effect. Here's what we do:

1) If an IP address has sent us any detected spam in the past hour, we
add a 20-point penalty when calculating our tarpit delay. For example,
the following log shows a spammer hitting us this morning:

     ...
     2005-02-15 11:35:45 199.181.134.40 23 1.3 
     2005-02-15 11:35:45 199.181.134.40 24 1.4 
     2005-02-15 11:37:21 199.181.134.40 22 49.8 Spammer
     ...


The 4th column is the number of recipients mailed from this IP in the
last hour. The 5th is how long the tarpit delayed the RCPT TO:. The 6th
is extra information the tarpit uses to calculate its delay. This
spammer had hit 24 local addresses before our spam scanning spotted them.
The tarpit then used this information to raise the per-recipient delay
from 1.4 seconds to 49.8.
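
As a rough illustration of the "spam in the past hour" check, here is a
minimal Python sketch. It is not our actual tarpit code: the class and
method names are made up for the example, and it only returns the
20-point penalty. How the points are turned into a delay in seconds is
deliberately left out.

```python
import time

SPAM_WINDOW_SECONDS = 3600  # "in the past hour"
SPAM_PENALTY_POINTS = 20    # the 20-point penalty described above


class SpamSourceCache:
    """Remembers when each IP last sent detected spam (hypothetical helper)."""

    def __init__(self):
        self._last_spam = {}  # ip -> timestamp of most recent detected spam

    def record_spam(self, ip, now=None):
        """Call this whenever the spam scanner flags a message from `ip`."""
        self._last_spam[ip] = time.time() if now is None else now

    def penalty(self, ip, now=None):
        """Return the penalty points to add for this IP at time `now`."""
        now = time.time() if now is None else now
        seen = self._last_spam.get(ip)
        if seen is None:
            return 0
        if now - seen > SPAM_WINDOW_SECONDS:
            # Entry is stale: forget it so the cache does not grow forever.
            del self._last_spam[ip]
            return 0
        return SPAM_PENALTY_POINTS
```

The `now` argument is only there so the behaviour can be checked with
fixed timestamps; in normal use it would be left out.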

2) The other thing we do is use spotted spam as a pattern for more
aggressive filtering. We don't use it to defer or block connections,
merely to improve the spam filtering that we do. When we identify a spam
message, we drop some details about it into a database along with a
timestamp. We then compare future mails with the entries in this
database. Each time a filter matches, its timestamp is refreshed, and a
filter is only deleted once it has not matched anything for 3 hours.

So, for example, our current dynamic filters contain (amongst about 200
others) the following entry:

     last_used = 2005-02-15 09:12:54
     sender_host_address = 206.81.116.19
     h_from = financialadvice <financialadvicetoday@???
     h_subject = the financial advice you've been looking for.


Any mail coming in from the same IP with either a similar From: or
Subject: line will be marked as spam. Any mail coming in from *anywhere*
with both a similar From: line *and* Subject: line will be marked as
spam.
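
The matching and expiry rules above can be sketched in Python roughly as
follows. This is an illustration only, not our real implementation: the
class names are invented, and "similar" is reduced to a case- and
whitespace-insensitive comparison purely as a stand-in for whatever
fuzzier test a real filter would use.

```python
from dataclasses import dataclass

FILTER_TTL_SECONDS = 3 * 3600  # a filter is dropped after 3 hours unmatched


def _norm(text):
    # Assumption: lower-case and collapse whitespace before comparing.
    return " ".join(text.lower().split())


def similar(a, b):
    # Stand-in for a real similarity test (hypothetical).
    return _norm(a) == _norm(b)


@dataclass
class DynamicFilter:
    last_used: float
    sender_host_address: str
    h_from: str
    h_subject: str


class DynamicFilterStore:
    def __init__(self):
        self.filters = []

    def add(self, ip, h_from, h_subject, now):
        """Record details of a message that was identified as spam."""
        self.filters.append(DynamicFilter(now, ip, h_from, h_subject))

    def expire(self, now):
        """Delete filters that have not matched within the TTL."""
        self.filters = [f for f in self.filters
                        if now - f.last_used <= FILTER_TTL_SECONDS]

    def is_spam(self, ip, h_from, h_subject, now):
        """Apply the two rules; every matching filter gets refreshed."""
        self.expire(now)
        matched = False
        for f in self.filters:
            from_ok = similar(f.h_from, h_from)
            subj_ok = similar(f.h_subject, h_subject)
            same_ip = f.sender_host_address == ip
            # Same IP: either header is enough. Any IP: both must match.
            if (same_ip and (from_ok or subj_ok)) or (from_ok and subj_ok):
                f.last_used = now
                matched = True
        return matched
```

Timestamps are passed in as plain seconds so that the refresh and expiry
behaviour is easy to follow; a database-backed version would do the same
comparisons against its stored last_used column.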

This filtering method identifies about 48% of all the spam we mark.
Around 10% of the spam matched by this filter isn't matched by any of
our other filters (including SBL, SpamCop, mail-abuse.org, Bayesian and
SpamAssassin).

Cheers,
Alun.

-- 
Alun Jones                       auj@???
Systems Support,                 (01970) 62 2494
Information Services,
University of Wales, Aberystwyth