[exim] SPAM Filtering - Losing the war!

Author: Vitaly A Zakharov
Date:
To: exim-users
Subject: [exim] SPAM Filtering - Losing the war!

In my opinion, every postmaster (except the Postgres daemon :-) MUST
prevent "false positives" as far as he can. Many mail systems are run by
incompetent system administrators, and such hosts are very likely to be
blocked by a spam filter because of a wrong configuration of the mailer
(or of DNS, if we are talking about DNS checks).

Then there are the "public" mail systems (free mail services, or the
mail servers of hosting providers). Many of them are configured
correctly. However, their hosts may still be listed in DNSBLs (very
often in Spamhaus, IMCO) because a customer of the company (or a user of
the free mail service) sends spam. Rejecting mail from those systems is
a mistake, because their administrators quickly discover and stop such
spam runs, and most of the mail from those systems is legitimate.

Instead of starting "holy war" flames, we SHOULD work out how to reduce
the number of false positive rejects in our spam checks without
reducing the number of true positive rejects.
Most spammer hosts have more sins than just an SBL listing or sending
"HELO friend". :-)

So here is a simple example for the Exim MTA that shows how to block
spammer hosts more accurately and make Exim work more efficiently:

First, we need three markers (because this simple configuration has
three checks). Each marker holds a boolean value (1 or 0): the result
of one spam check.

   warn  set acl_m0 = 0
         set acl_m1 = 0
         set acl_m2 = 0

Each sender MAY go through all of the spam tests and collect some labels
along the way. By examining these labels at the end, we assess the
character of the host and reject or accept the message.

We run the DNSBL listing test and remember the result in the variable
acl_m0:

   warn    dnslists       = sbl.spamhaus.org: \
                            bl.spamcop.net: \
                            relays.mail-abuse.org
           set acl_m0     = 1
           set acl_c0     = $acl_c0 Listed in DNSBL $dnslist_domain;

We also append the result to the status string variable acl_c0. We will
need it later. The verb "warn" means that no action is taken on the
session; it is only a passive check.

The second spam test in our example checks that the PTR (reverse) and
A (forward) DNS records agree, using Exim's built-in
"reverse_host_lookup" verification:

   warn    !verify        = reverse_host_lookup
           set acl_m1     = 1
           set acl_c0     = $acl_c0 Reverse host lookup failed;

The result is remembered in acl_m1.

The last spam test checks the argument of the HELO command; we put the
result in acl_m2.

   warn    !condition     = ${if or {\
                 {eq{$sender_helo_name}{$sender_host_name}}\
                 {match{$sender_helo_name}{\\[$sender_host_address\\]}}\
                                    }}
           set acl_m2     = 1
           set acl_c0     = $acl_c0 HELO forged;

The HELO command argument should be an FQDN (Fully Qualified Domain
Name) or an IP literal (an IP address in brackets, like [1.2.3.4]) that
points to the sending host (this is a summary; see RFC 2821 for full
details).
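
For readers who want to try the HELO rule outside Exim, here is a
minimal Python sketch of the same idea. The function name and its
parameters are hypothetical, chosen only for this illustration. It also
assumes a case-insensitive comparison and an exact match on the IP
literal, which is a little stricter than the unanchored "match" in the
ACL above.

```python
def helo_is_plausible(helo_name, host_name, host_address):
    """Return True if the HELO argument appears to name the connecting host.

    helo_name    -- the argument the client gave to HELO/EHLO
    host_name    -- the name found by the reverse lookup ("" or None if none)
    host_address -- the IP address of the connecting host, as text
    """
    helo = helo_name.strip().lower()

    # Case 1: the HELO argument equals the name from the reverse lookup.
    if host_name and helo == host_name.strip().lower():
        return True

    # Case 2: the HELO argument is an IP literal for the connecting
    # address, for example [192.0.2.1].
    return helo == "[" + host_address + "]"
```

A host that says "HELO friend" fails both cases, so the marker would be
set for it.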

Now we have three results and must decide what action to take for this
SMTP session: abort it, accept the message, or defer.

This section describes how to handle the message. If two (or all three)
variables are true, the message is rejected; otherwise it is accepted.

   deny  condition = ${if and{\
                      {eq{$acl_m0}{1}}\
                      {eq{$acl_m1}{1}}\
                      }{1}{0}}
         message      = $acl_c0

   deny  condition = ${if and{\
                      {eq{$acl_m0}{1}}\
                      {eq{$acl_m2}{1}}\
                      }{1}{0}}
         message   = $acl_c0

   deny  condition = ${if and{\
                      {eq{$acl_m1}{1}}\
                      {eq{$acl_m2}{1}}\
                      }{1}{0}}
         message   = $acl_c0

accept
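
The "two out of three" rule can also be written down in a few lines of
Python, which may make the intent easier to see than the three deny
blocks. This is only a sketch of the decision logic, not Exim code; the
function and parameter names are hypothetical.

```python
def should_reject(dnsbl_listed, reverse_failed, helo_forged):
    """Reject when at least two of the three spam markers are set.

    Each argument is a boolean and corresponds to one marker
    (acl_m0, acl_m1 and acl_m2 in the configuration above).
    """
    markers = [dnsbl_listed, reverse_failed, helo_forged]
    failed = sum(1 for marker in markers if marker)
    return failed >= 2
```

A host that fails only one test, for example a correctly configured
free mail server that happens to be listed in a DNSBL, is accepted.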

Note that when a mailer rejects a message, it MUST include a
description of the reason for the reject in the text sent with the 5xx
status code. This helps postmasters to resolve problems in the case of
a "false positive".

Exim provides the means to detect many signs of spam and handles them
very well.

Here is an example of a more advanced configuration that computes a
"spam weight" for the session by adding up a "spam score". It is based
on the assumption that an SBL listing (30 points) is more significant
than both a mismatch in the DNS records (10 points) and a forged HELO
argument (20 points):

   warn      set acl_m0     = 0

   warn    dnslists       = sbl.spamhaus.org: \
                            bl.spamcop.net: \
                            relays.mail-abuse.org
           set acl_m0     = ${eval:$acl_m0+30}
           set acl_c0     = $acl_c0 Listed in DNSBL $dnslist_domain;

   warn    !verify        = reverse_host_lookup
           set acl_m0     = ${eval:$acl_m0+10}
           set acl_c0     = $acl_c0 Reverse host lookup failed;

   warn    !condition     = ${if or {\
                 {eq{$sender_helo_name}{$sender_host_name}}\
                 {match{$sender_helo_name}{\\[$sender_host_address\\]}}\
                                    }}
           set acl_m0     = ${eval:$acl_m0+20}
           set acl_c0     = $acl_c0 HELO forged;

To make the process clearer and more interesting, I added a sender
address verification block using Exim's built-in "callout" mechanism:

   warn    !verify        = sender/callout=1m,defer_ok
           set acl_m0     = ${eval:$acl_m0+20}
           set acl_c0     = $acl_c0 Cannot complete sender verify;

And now we reject messages whose spam score is equal to or greater than
40:

   deny    condition = ${if >={$acl_m0}{40}}
           message   = $acl_c0

In summary, the following combinations of results lead to a reject:

1.    sender is listed in SBL
    PTR and A DNS records do not coincide
    HELO argument is forged
    sender name is forged

2.    sender is listed in SBL
    PTR and A DNS records do not coincide
    HELO argument is forged

3.    sender is listed in SBL
    HELO argument is forged
    sender name is forged

4.    sender is listed in SBL
    PTR and A DNS records do not coincide
    sender name is forged

5.    sender is listed in SBL
    PTR and A DNS records do not coincide

6.    sender is listed in SBL
    HELO argument is forged

7.    sender is listed in SBL
    sender name is forged

8.    PTR and A DNS records do not coincide
    HELO argument is forged
    sender name is forged

9.    HELO argument is forged
    sender name is forged

Uff. That seems to be every possible rejecting situation (except
defers) for this set of tests. :-)
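
The list above can be checked mechanically. The following Python sketch
enumerates every combination of failed tests and keeps those whose
total reaches the threshold. The weights (30, 10, 20, 20) and the
threshold (40) are taken from the configuration above; the short test
names and the function name are hypothetical labels for this
illustration only.

```python
from itertools import combinations

# Points added by each failed test, as in the ACL above.
WEIGHTS = {"sbl": 30, "ptr": 10, "helo": 20, "sender": 20}

# A message is rejected when its score is equal to or greater than this.
THRESHOLD = 40


def rejected_combinations(weights=WEIGHTS, threshold=THRESHOLD):
    """Return every combination of failed tests that leads to a reject."""
    names = list(weights)
    result = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            score = sum(weights[name] for name in combo)
            if score >= threshold:
                result.append(combo)
    return result


if __name__ == "__main__":
    for combo in rejected_combinations():
        print(sum(WEIGHTS[name] for name in combo), " + ".join(combo))
```

With these weights no single test is enough for a reject, and neither
is a PTR/A mismatch combined with only a forged HELO or only a failed
sender check (30 points each).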

With this "intelligent" MTA behaviour we can handle mail transactions
more accurately. It is a very simple and efficient way, but not the
only one. We can also prevent false positive rejects in the spam checks
of mail systems by using data mining methods.

Any questions are welcome.

> SPAM Filtering - Losing the war!
Losing the war?! I don't think so. Take the defence! :-)

Best regards, ded3axap.
Ramenskoye. Russia.
