[Exim] Re: .forward files and spam leaks

Author: Alan J. Flavell
Date:
To: Exim users list
Old topics: [Exim] .forward files and spam leaks
Subject: [Exim] Re: .forward files and spam leaks
On Mon, 1 Dec 2003, Alan J. Flavell wrote:

> As I've noted in a recent posting on a somewhat related topic: quite
> a few of our users need to maintain an account elsewhere, from which
> they set a .forward file to their local email address here.
>
> Some of the measures taken locally to keep spam out (spam-rating of
> body content, headers, envelope-sender address) are equally effective
> on such mail as on mail that's directly offered to us: but one key
> item for spam control is the address of the MTA from which the mail is
> offered - and by the time that forwarded mail is offered to us, that
> address is deeply nested in "Received" headers.


OK, so here's an update on what I had posted back then.

The chief problem with the recipe as posted before was that the
relevant forwarding sites have several mail servers which accept
incoming mail, but these servers also pass mail amongst themselves, or
to and from a mailing list server or a virus-checking/spam-content-
rating server, before they finally forward the mail to us. Sometimes
there were half a dozen Received: lines to be evaluated before
reaching the culprit IP from which they actually accepted the spam.

The original recipe was often misidentifying the particular
"Received:" header while it was trying to find the point at which the
site had accepted the inbound mail.

The revised recipe scans the Received: headers for the
[dotted.decimal.ip.address] pattern, and skips any which are internal
to the site, until it gets to one that isn't. That's then deemed to
be the IP address from which they accepted the incoming mail, and we
look that up in the DNSrbl(s).

For the purpose of discussion, let's say we are dealing with a
fictional forwarding site leaky.example whose IP addresses are
123.123.*.* (my apologies to the real site that uses those addresses).

Also within the site, we sometimes see mailers handing mail to
themselves via 127.0.0.1, so we have to skip those too; and
one might also have to deal with sites whose internal mail path
involves "private" IP networks such as 192.168.*.* etc. - I think
it will be obvious how to include those in the recipe, if you need to.

Once again, I'm anonymizing the recipes "on the fly" while posting, so
my apologies if any typos creep in. :-}

Recommended reading: "negative lookahead" in a regex spec or tutorial
(I used Phil's PCRE documentation).

What the regex says is to match any [decimal.ip.address.pattern]
*unless* the initial substring happens to match '[123.123.' or
'[127.0.'. To include further "leaky forwarding" sites in this
recipe, just add the relevant IP prefixes to the regex - including any
"private" IP networks that they are using inside the forwarding site.

And of course add the additional sites to the initial "hosts ="
condition which bails out of the ACL stanza for MTAs to which this
recipe doesn't apply.

  warn set acl_m4 =
       hosts = *.leaky.example
       set acl_m4 = ${if match {$h_received:}\
               {\N\[(?!123\.123\.|127\.0\.)\
              (\d+)\.(\d+)\.(\d+)\.(\d+)\]\N}\
               {$4.$3.$2.$1}fail}
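
For anyone who wants to try the negative lookahead outside Exim, here
is a rough Python sketch of the same pattern. It assumes the Received:
lines reach the regex as one string, newest first; the function name,
host names and the 192.0.2.77 address are invented for illustration
and are not part of the recipe.

```python
import re

# Same pattern as in the ACL above: a bracketed dotted-decimal address
# whose opening characters are NOT one of the internal prefixes.
EXTERNAL_IP = re.compile(
    r"\[(?!123\.123\.|127\.0\.)(\d+)\.(\d+)\.(\d+)\.(\d+)\]"
)


def reversed_external_ip(received):
    """Return the first non-internal address with octets reversed, or ''."""
    m = EXTERNAL_IP.search(received)
    if m is None:
        return ""
    a, b, c, d = m.groups()
    return f"{d}.{c}.{b}.{a}"


# Made-up Received: lines, newest first. All names and addresses here
# are hypothetical sample data.
sample = (
    "from mx2.leaky.example (mx2.leaky.example [123.123.4.5]) by mail.example\n"
    "from localhost (localhost [127.0.0.1]) by mx2.leaky.example\n"
    "from mx1.leaky.example (mx1.leaky.example [123.123.1.2]) by mx2.leaky.example\n"
    "from unknown (HELO spammer) ([192.0.2.77]) by mx1.leaky.example\n"
)

print(reversed_external_ip(sample))  # prints 77.2.0.192
```

The three internal hops are skipped because the lookahead rejects them
right after the opening bracket; the first address that survives is
taken as the one the forwarder accepted the mail from.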


# at this point you have acl_m4 either empty, or containing the
# address from which the leaky.example site accepted the item.
# So you can do things like this:

  warn condition = ${if eq {$acl_m4}{} {no}{yes}}
       message = X-PH-FW: SPAM leaky forwarder, listed at \
                     $dnslist_domain=$dnslist_value $dnslist_text
       dnslists = spam.dnsbl.sorbs.net/$acl_m4  \
                   :  bl.spamcop.net/$acl_m4 \
                   : l1.spews.dnsbl.sorbs.net/$acl_m4


and rate the result in spamassassin (what we're currently doing); or
if you felt strongly enough, you could have an outright deny.
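
A note on why the octets are stored as $4.$3.$2.$1: DNS blocklists
are conventionally queried by putting the address, octets reversed, in
front of the list's zone name. The following Python sketch only builds
those query names (it does no DNS lookups); the function names are my
own, and the sample key is a made-up documentation address.

```python
# Zones as used in the dnslists line of the recipe above.
ZONES = [
    "spam.dnsbl.sorbs.net",
    "bl.spamcop.net",
    "l1.spews.dnsbl.sorbs.net",
]


def dnsbl_query_name(reversed_ip, zone):
    """Join an already-reversed IPv4 address to a DNSBL zone name."""
    return f"{reversed_ip}.{zone}"


def query_names(reversed_ip):
    """List the names that would be looked up for one reversed address."""
    # An empty key means no external address was found: nothing to ask.
    if not reversed_ip:
        return []
    return [dnsbl_query_name(reversed_ip, zone) for zone in ZONES]


for name in query_names("77.2.0.192"):  # hypothetical key, i.e. 192.0.2.77
    print(name)
```

The empty-key case corresponds to the "condition =" line in the ACL,
which skips the lookup when acl_m4 is empty.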

What we're actually doing is slightly more elaborate:

  warn set acl_m4 =
       hosts = *.leaky1.example : *.leaky2.example : our.secmx.example
       condition = ${if match {$h_received:}\
              {\Nleaky1\.example|leaky2\.example\N} {yes}{no}}
       set acl_m4 = ${if match {$h_received:}\
               {\N\[(?!123\.123\.|...etc...|127\.0\.)\
              (\d+)\.(\d+)\.(\d+)\.(\d+)\]\N}\
               {$4.$3.$2.$1}fail}


What that does is to *also* apply the test to mail that's routed via
our backup MX, but ONLY if the mail also implicates the leaky
forwarder sites. In the IP address pattern, we have to list the IP
substrings that are used by the leaky forwarders, as well as those
used by our secondary MX; these are denoted by ...etc... in the recipe
shown above.
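
To illustrate the two-step logic (first check that a leaky forwarder
is implicated, then skip every internal prefix), here is a Python
sketch. The prefix list is purely hypothetical: 123.123. is the
fictional site from this posting, and 198.51.100. is an invented
stand-in for a secondary MX, not the real value hidden behind
...etc... above. The function name and sample headers are likewise my
own.

```python
import re

# Hypothetical internal prefixes (see the note above): the fictional
# leaky site, an invented secondary-MX range, and loopback.
INTERNAL_PREFIXES = ["123.123.", "198.51.100.", "127.0."]

# Build the negative lookahead from the prefix list.
lookahead = "|".join(re.escape(p) for p in INTERNAL_PREFIXES)
EXTERNAL_IP = re.compile(
    r"\[(?!" + lookahead + r")(\d+)\.(\d+)\.(\d+)\.(\d+)\]"
)

# The extra condition: some Received: line must name a leaky forwarder.
IMPLICATED = re.compile(r"leaky1\.example|leaky2\.example")


def key_for_lookup(received):
    """Reversed external address, or '' if the rule does not apply."""
    if not IMPLICATED.search(received):
        return ""
    m = EXTERNAL_IP.search(received)
    return ".".join(reversed(m.groups())) if m else ""


# Invented sample: mail that came via the backup MX from a leaky site.
via_leaky = (
    "from our.secmx.example ([198.51.100.3]) by mail.example\n"
    "from mx.leaky1.example ([123.123.1.1]) by our.secmx.example\n"
    "from relay ([203.0.113.9]) by mx.leaky1.example\n"
)
# Invented sample: mail that came via the backup MX only.
backup_only = (
    "from our.secmx.example ([198.51.100.3]) by mail.example\n"
    "from somewhere ([203.0.113.9]) by our.secmx.example\n"
)

print(key_for_lookup(via_leaky))    # prints 9.113.0.203
print(repr(key_for_lookup(backup_only)))  # prints ''
```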

I hope that's fairly clear? As you can see, it's only really
practical to do it in this way for a relatively limited number of
forwarding sites which are (a) of major importance to us and (b)
leaking more spam than we care to get.

I stress that this does NOT rate all mail that's coming from our
backup MX (fortunately, that's doing a pretty good job of keeping spam
out - thanks Chris ;-) - it only applies the rule if the leaky
forwarding sites were *also* implicated in the chain. But this is
only our example - others can tune this idea to their own situation,
obviously.

all the best