On Nov 2, 2004, at 11:24, Tore Anderson wrote:
> * Tor Slettnes
>> A better way is to perform the filtering at SMTP time (i.e. in the
>> Exim ACL), and reject (550) or greylist (451) suspected spam.
>
> Well, obviously we disagree on what constitutes "optimal" spam
> filtering techniques. By doing SMTP-time rejections you can pretty
> much forget about using any kind of adaptive filtering method which
> requires a feedback loop of some sort.
Not really. For one thing, running some checks at SMTP time does not
preclude you from running additional checks later (e.g. in the MUA),
though personally I like to depend on SMTP-time filtering as much as
possible.
> Greylisting is a pain to deal with when you run busy sites, I
> certainly hope it won't be much more popular than it is today (at
> least not those which key off the sender host plus the envelope
> recipient). There's a reason why Yahoo Groups, for instance, have
> chosen to interpret 4xx as 5xx..
I am a bit puzzled by that reaction. For busy sites in particular (at
least those that receive a fair amount of spam), greylisting should
reduce the demand for bandwidth and server resources: you won't receive
the message body for most spam, so you won't need to run content
scanning such as SpamAssassin or Brightmail on it.
And typically, after a particular recipient's list of contacts has been
'trained', less than 10% of their legitimate mail tends to be subject
to greylisting in the first place.
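To make the mechanism concrete, here is a minimal sketch in Python (not
Exim ACL syntax) of a greylist that keys off the sender host plus the
envelope sender and recipient. The class name, the 5-minute minimum
delay and the 36-day expiry are my own assumptions for illustration;
a real setup would do this in the ACL with a persistent store.

```python
import time


class Greylist:
    """Toy in-memory greylist keyed on (host, envelope sender, envelope recipient)."""

    def __init__(self, min_delay=300, max_age=36 * 24 * 3600, clock=time.time):
        self.min_delay = min_delay   # assumed: seconds a new triplet must wait
        self.max_age = max_age       # assumed: seconds before a triplet is forgotten
        self.clock = clock           # injectable so the logic can be tested
        self.first_seen = {}         # triplet -> time of first delivery attempt

    def check(self, host, sender, recipient):
        """Return an SMTP reply code: 451 to defer, 250 to accept."""
        key = (host, sender.lower(), recipient.lower())
        now = self.clock()
        seen = self.first_seen.get(key)
        if seen is None or now - seen > self.max_age:
            # Unknown (or expired) triplet: remember it and ask for a retry.
            self.first_seen[key] = now
            return 451
        if now - seen < self.min_delay:
            # Retried too quickly; keep deferring.
            return 451
        # Known triplet that waited long enough: let it through.
        return 250
```

The decision is made at RCPT TO, before DATA, which is why the message
body of a deferred message is never transferred. Once a triplet has
passed, later mail from the same contact is accepted immediately.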
I don't think Yahoo! has a deliberate policy of no-retry deliveries
with greylisting in mind. Their SMTP behaviour dates further back than
that.
> As for bounces being collateral spam these days, I agree. And by
> doing SMTP-time rejections you create these as well, albeit
> indirectly.
Usually not - certainly not to the extent that you would if you bounced
mail yourself.
- In the case of spam/viruses delivered directly from the sender (i.e.
most of it), no legitimate MTA is involved in the first place, and no
bounces are generated.
- In the case of spam delivered via an open relay, that relay may
generate bounces; some of which will be sent back to forged sender
addresses, some of which will get frozen in the queue of that relay.
- In the case of "false positives", the sender address is usually
legitimate; they will get the DSN via their own service provider.
Of these, the main culprits are the open relays. IMO, it is better to
let these generate collateral spam indirectly (with the problems that
entails) than to do it yourself.
Of course, there is the special case of mail forwarded to you from
"trusted" sources, such as backup MXes, accounts elsewhere, etc. For
that, you use host-based whitelists.
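As a rough sketch of what such a host-based whitelist amounts to
(again in Python rather than Exim ACL syntax): the network list uses
documentation address ranges, and the function names are hypothetical,
chosen only for this example.

```python
import ipaddress

# Assumed example ranges for backup MXes and other forwarders you trust.
TRUSTED_NETWORKS = [
    ipaddress.ip_network(net) for net in ("192.0.2.0/24", "2001:db8::/32")
]


def is_trusted(host_ip):
    """True if the connecting host falls inside a whitelisted network."""
    addr = ipaddress.ip_address(host_ip)
    return any(addr in net for net in TRUSTED_NETWORKS)


def smtp_verdict(host_ip, suspected_spam):
    """Return 250 (accept) or 550 (reject) for one delivery attempt."""
    if is_trusted(host_ip):
        # Never reject mail from a trusted forwarder at SMTP time;
        # it would have to bounce the message to a possibly forged sender.
        return 250
    return 550 if suspected_spam else 250
```

Mail accepted this way can still be filtered after delivery; the point
is only that the trusted host is never handed a 5xx for it.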
I cover a lot of these considerations here:
http://tldp.org/HOWTO/Spam-Filtering-for-MX/whysmtptime.html
http://tldp.org/HOWTO/Spam-Filtering-for-MX/forwardedmail.html
> Not to mention that everyone running relays will hate you for
> filling their queue with undeliverable bounces.
If they run _open_ relays, they deserve what they get.
"Closed" relays, such as mailing list servers, *should* be whitelisted.
This, too, can be supported per-user, without restricting each mail to
one recipient. See the links above for details.
>> * Force one recipient per message by deferring subsequent
>> recipients:
>
> That's certainly a non-option. For my own personal low-traffic
> domain, maybe, but not for the more trafficked sites I run. The users
> would never accept the delays (and missing email from sites such as
> Yahoo Groups) that limiting the envelope recipient tally to one will
> impose.
Agreed.
-tor