[exim] Extending greylisting

Author: Alun
Date:
To: exim-users
Subject: [exim] Extending greylisting
Hi all,

I've had an idea that could make greylisting more useful
in the presence of spammers that retry. I thought I'd publicise
it somewhere to see what people think.

With greylisting, we issue a temporary deferral for new
(sender/recipient) pairs (or sender/IP/recipient triples). The
deferral expires after a period (I use 1 hour) and, if the
pair or triple is then retried, we accept the message (or at least
pass it on to other anti-spam checks).

Spammers are starting to use retries to get around the greylisting
system - my inbox bears testimony to this.

Anyway... thinking about what a spammer needs to do gave me an
idea. I'm describing this in isolation from any existing
greylisting system - it's entirely possible to integrate it into
existing greylist code, but it's easier (for testing) to keep it
separate.

For each RCPT command, record in a database the sender, recipient, IP,
time first seen and status. Do this for all senders and recipients,
valid or invalid. "status" is "GOOD" for an existing local recipient
address and "BAD" for a non-existent one. If you want
to be cute, you could also record "BAD" for non-existent senders,
using whatever sender verification procedure you care to.
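
Here's a minimal sketch of the recording step, written in Python with
SQLite rather than the perl module mentioned later in this message. The
table name rcpt_log, its columns, and the helpers open_log and
record_rcpt are all assumptions made for illustration; how the values
get out of Exim and into the call is not shown.

```python
import sqlite3
import time

def open_log(path=":memory:"):
    """Open (or create) the hypothetical attempt log."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS rcpt_log (
               sender     TEXT NOT NULL,
               recipient  TEXT NOT NULL,
               ip         TEXT NOT NULL,
               first_seen REAL NOT NULL,   -- Unix time of the first attempt
               status     TEXT NOT NULL CHECK (status IN ('GOOD', 'BAD')),
               PRIMARY KEY (sender, recipient, ip)
           )"""
    )
    return db

def record_rcpt(db, sender, recipient, ip, recipient_exists, now=None):
    """Log one RCPT attempt; a retry keeps the original first_seen."""
    status = "GOOD" if recipient_exists else "BAD"
    db.execute(
        "INSERT OR IGNORE INTO rcpt_log VALUES (?, ?, ?, ?, ?)",
        (sender, recipient, ip, time.time() if now is None else now, status),
    )
    db.commit()
```

Keying on the (sender, recipient, IP) triple means a retry doesn't
overwrite the time first seen, which the later calculations rely on.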

Now, when a host retries, you can query its (attempted) submission
history to get an idea of its intentions.

The general idea is that a spammer who encounters a greylist and retries
needs to have lots of addresses outstanding in the deferred state at your
host - if they just submit one mail at a time and retry each mail before
moving to the next, they'll not get much throughput (24 mails per day at
my site, given the 1-hour deferral).
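
The 24-per-day figure is just the 1-hour deferral divided into a day. A
tiny sketch of that arithmetic, assuming an idealised sender that handles
one message at a time and retries exactly when the deferral expires (the
function name is made up for the example):

```python
SECONDS_PER_DAY = 24 * 60 * 60

def max_serial_mails_per_day(deferral_seconds):
    """Upper bound on mails/day for a sender that waits out each
    deferral before starting on the next message."""
    return SECONDS_PER_DAY // deferral_seconds

print(max_serial_mails_per_day(3600))  # 1-hour deferral -> 24
```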

However, if they put a lot of messages into the "grey" state, you can get an
idea of their intentions before they ever get a message delivered to you.

If their list of addresses isn't very accurate, they'll show up with lots
of "BAD" submissions during the greylist period, so the ratio of
(count(BAD)/(count(GOOD)+count(BAD))) will be high.
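
As a rough sketch, the ratio could be computed like this from the
statuses recorded for one IP during the greylist period. The function
name, and the choice to return 0.0 when nothing has been recorded yet,
are assumptions made for the example.

```python
def bad_ratio(statuses):
    """Return count(BAD) / (count(GOOD) + count(BAD)) for one host.

    statuses is an iterable of "GOOD"/"BAD" strings; an empty history
    yields 0.0 (an assumed convention) rather than dividing by zero.
    """
    statuses = list(statuses)
    bad = statuses.count("BAD")
    good = statuses.count("GOOD")
    total = good + bad
    return bad / total if total else 0.0
```

What counts as "high" is left open here; it would need tuning against
real traffic.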

If their list is accurate, but you've not seen the IP before,
count(*)/(now-min(seen time for sender/recipient pairs from this IP))
might be high.
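
A sketch of that rate, assuming the first-seen times for one IP have
already been pulled out of the database as Unix timestamps. The function
name and the one-second floor on the elapsed time are assumptions for
illustration.

```python
def new_pair_rate(first_seen_times, now, min_elapsed=1.0):
    """count(*) / (now - min(first seen)), in new pairs per second.

    min_elapsed is an assumed floor that avoids dividing by zero when
    the host's first attempt arrived this very second.
    """
    first_seen_times = list(first_seen_times)
    if not first_seen_times:
        return 0.0
    elapsed = max(now - min(first_seen_times), min_elapsed)
    return len(first_seen_times) / elapsed
```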

In fact, any host with a high count of new addresses in the past
few minutes may well be suspect.
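
One way to sketch this check, again in Python. The 5-minute window and
the limit of 20 new pairs are illustrative guesses rather than tuned
values, and the function name is hypothetical.

```python
from collections import Counter

def busy_hosts(attempts, now, window=300.0, limit=20):
    """Return {ip: count} for hosts with more than `limit` new pairs
    first seen within the last `window` seconds.

    attempts is an iterable of (ip, first_seen) tuples, one per new
    sender/recipient pair, with first_seen as a Unix timestamp.
    """
    recent = Counter(
        ip for ip, first_seen in attempts
        if 0 <= now - first_seen <= window
    )
    return {ip: n for ip, n in recent.items() if n > limit}
```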

I imagine there are various other inferences that can be drawn from
the data. At the moment I've got a perl module recording information,
and will experiment over the next few days.

Looking at the overall stats, I can see some real candidates for
IP blacklisting. 82.165.237.16 submitted 323 bad addresses and
132 good addresses over the course of a day recently. It's not
blacklisted at MAPS or Spamhaus and its mails were largely not
caught by SpamAssassin here (scored 4.1). In its first hour,
however, it submitted 7 bad addresses and 2 good ones. If I'd
penalised it with 2 SpamAssassin points as a result, it would
have exceeded my threshold for blocking and, looking at its
overall performance, I'd be quite happy to put it in a local IP
blacklist. I should mention that one of these messages, when I
looked at it, turned out to be a phishing attack.
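
To make the penalty idea concrete, here's a sketch using the first-hour
figures above. The ratio cutoff of 0.5 is invented for the example, and
the blocking threshold of 5.0 is SpamAssassin's default required score,
used only because my own threshold isn't given in this message; the
function name is hypothetical too.

```python
def penalised_score(base_score, bad, good, ratio_cutoff=0.5, penalty=2.0):
    """Add `penalty` points when the BAD ratio exceeds `ratio_cutoff`."""
    total = bad + good
    ratio = bad / total if total else 0.0
    return base_score + penalty if ratio > ratio_cutoff else base_score

REQUIRED_SCORE = 5.0  # SpamAssassin default; an assumption for this site

score = penalised_score(4.1, bad=7, good=2)  # first-hour figures from above
print(score > REQUIRED_SCORE)
```

With 7 bad out of 9 attempts the ratio is about 0.78, so the penalty
applies and the score moves from 4.1 to roughly 6.1.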

Any thoughts, anyone?

Cheers,
Alun.

-- 
Alun Jones                       auj@???
Systems Support,                 (01970) 62 2494
Information Services,
University of Wales, Aberystwyth