Re: [exim] Default enabling of dnsdb

Author: Phil Pennock
Date:
To: W B Hacker
CC: exim-users
Subject: Re: [exim] Default enabling of dnsdb

On 2009-05-07 at 03:43 +0800, W B Hacker wrote:
> - IF I were to receive incoming purporting to be from, using your
> example, globnix.com AND the rDNS passed muster, I'd presume you
> intended to send, and - while making other tests - not have a care as to
> the presence, let alone 'nuances' of an spf record.
>
> Simply put, it tells me nothing any more useful than what is already in
> front of me.

No. If the rDNS is for IP address space allocated by AfriNIC to a
company in Nigeria, then you have no reason to suspect that the mail is
potentially dodgy other than by content examination. The public
assertion that "this domain does not send mail" provides you with a data
point.

Whether or not you use that data point is up to you.

> - IF, OTOH, said arrival was a forgery, I'd not need to look at an spf
> record to determine that, either.

You have a magic gateway which detects forgeries sent from static IP
hosts with valid rDNS. I am impressed.

> - A 'legit' message would live or die on the credentials of the sending
> server, (rDNS, FQDN in HELO, correct format and MIME-type usage), AND
> NOT ClamAV or SA finding malware or unwanted content/attachments.
>
> - A forgery would not make it past acl_smtp_connect.

The majority of forgeries, perhaps; but not all forgeries are so
trivially and obviously wrong. I don't care so much about the 90% of
spam that is trivially rejected by such checks, I do care about the 10%
of spam which makes it past those and yet still comes in sufficiently
large volume that filtering it is desirable.

It's all about providing signal. I publish signal, you're free to use
or ignore that signal.

> Are you making this sort of callout / DNSDB lookup on all - or even a
> large percentage of traffic transiting?
>
> Surely the percentage of arrivals that might have usable information
> must be small?

"Callout" is somewhat disingenuous in a mail context, unless you're
saying that looking up any records in DNS is a callout, given the
negative connotations to SMTP callouts or other such heavyweight checks.

The cost of a DNS lookup which fails is fairly light, except in the case
of badly broken DNS setups (timeouts on requests for RR types which are
not configured, SERVFAIL, whatever) and anyone sending mail should
expect a lookup for an SPF record in any case. If there's an SPF
record, I've populated my cache for some SPF analysis stats I am
considering running. If it lets me reject mail, that's even better.
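
As a rough illustration of the kind of check being discussed, here is a
minimal Python sketch. It assumes the TXT record has already been fetched
(by dnsdb or any other resolver) and only tests whether the record is the
bare "this domain does not send mail" policy, `v=spf1 -all`. The function
name `is_null_spf` is hypothetical, and this is not the configuration used
on the server described here.

```python
def is_null_spf(txt_record: str) -> bool:
    """Return True if a TXT record is exactly the SPF policy 'v=spf1 -all'.

    Assumption: the caller has already looked the record up. Anything
    other than the bare two-term policy is treated as "not null", which
    is deliberately conservative.
    """
    # SPF terms are whitespace-separated and case-insensitive.
    terms = txt_record.strip().lower().split()
    return terms == ["v=spf1", "-all"]


if __name__ == "__main__":
    for record in ("v=spf1 -all", "v=spf1 mx -all", "some unrelated TXT data"):
        print(record, "->", is_null_spf(record))
```

A record that lists any permitted sender (for example `mx`) is left for a
full SPF evaluation; only the outright "no mail" assertion is matched.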

In terms of how much benefit is seen in implementing this: over the past
month this has been responsible for between 0.1% and 2.8% of my daily
rejections. This is on a server which just handles mail for myself and
my wife -- I'm no longer a paid postmaster (and no longer working on
anything related to email, either).

Peaks are (file, percentage, SPF rejections, total rejections):
rejectlog-20090403: 2.842       32 1126
rejectlog-20090421: 0.800       45 5622
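
The percentage column is simply SPF rejections divided by total
rejections. A small Python sketch reproducing the two rows above (the
function name `spf_share` is made up for illustration; the counts are the
ones quoted):

```python
# (SPF rejections, total rejections) per reject log, as quoted above.
peaks = {
    "rejectlog-20090403": (32, 1126),
    "rejectlog-20090421": (45, 5622),
}


def spf_share(spf_rejections: int, total_rejections: int) -> float:
    """Percentage of all rejections that were due to the SPF check."""
    return 100.0 * spf_rejections / total_rejections


if __name__ == "__main__":
    for name, (spf, total) in peaks.items():
        print(f"{name}: {spf_share(spf, total):.3f} {spf:>8} {total}")
```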

I use clamav but don't currently do content-based rejections, as little
enough spam makes it through the other checks that I can
discard the remainder fairly safely. I don't have the skill or time to
set up decent machine-learning content analysis to the point where it
doesn't upset my paranoid side. By "discard", I mean "save it into a
spam folder" so that at least I'll have a nice *cough* large corpus by
the time I next get to spend some serious time on this.

I consider the cost of a DNS lookup to be less than the cost of
verifying a DomainKeys/DKIM signature and really to be small enough
that, at my current scale, it's worthwhile. Of course, DNS caching
capacity scales differently to CPU scaling, so this would obviously need
to be evaluated differently at mega-large providers. For most of us, a
DNS cache or two will have more than adequate resources.

Of course, I reject immediately, rather than setting an ACL variable and
denying at the end of the ACL, which would give me more complete stats on
how many different checks would have matched. So some of these would
have been rejected by other methods too. Reworking this to provide me
with better stats is deferred until the next occasion that I have
sufficient free time.
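
The "record every match, deny at the end" approach can be sketched
outside Exim in a few lines of Python. Everything here is hypothetical:
the check names, the message fields and the `evaluate` helper are
assumptions for illustration, not the ACL in use on this server.

```python
from collections import Counter

# Hypothetical checks: (name, predicate over a message represented as a dict).
checks = [
    ("null_spf", lambda m: m.get("spf") == "v=spf1 -all"),
    ("no_rdns", lambda m: not m.get("rdns")),
]


def evaluate(message, checks, stats):
    """Run every check, count each one that matches, then decide once."""
    matched = [name for name, predicate in checks if predicate(message)]
    stats.update(matched)  # per-check counters, even when several match
    return ("reject" if matched else "accept", matched)


if __name__ == "__main__":
    stats = Counter()
    print(evaluate({"spf": "v=spf1 -all", "rdns": ""}, checks, stats))
    print(evaluate({"spf": "v=spf1 mx -all", "rdns": "mail.example.org"}, checks, stats))
    print(dict(stats))
```

Rejecting at the first match would credit only the first check; running
all of them first shows how much the checks overlap.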

-Phil
