[ On Sunday, July 6, 2003 at 19:19:44 (+0100), Alan J. Flavell wrote: ]
> Subject: Re: [Exim] error response rate limiting vs. overloaded systems
>
> On Sun, 6 Jul 2003, Greg A. Woods wrote:
>
> > You've missed the key point: "and so that the errant client can't come
> > right back at you with another attempt to cause the same error."
>
> "can't"? They most certainly *can*.
> Multiprogramming isn't exactly a new idea.
Well, yes, OK: "can, but almost never will".
Clients would have to be extremely badly behaved to re-connect to the
same server immediately upon receiving a 5xx error from that server. I
doubt even spammers would see any benefit in trying this, there's
little for DoS attackers to gain from it either, and I've not
encountered any normal clients that are so badly programmed. Indeed all
clients I've encountered to date, including spamware, are well enough
behaved that they will wait patiently for the very last line of a
multi-line 5xx error response (even those so broken that they report
only the first line in their own bounce or error message!):
    time:  stream:
       0   RCPT TO:<no-such-user@target>
       1   550-5.1.1 reject sending to address '<no-such-user@target>'.
       2   550-5.1.1 The address <no-such-user> was not accepted.
       3   550-5.1.1 Reason given was: (ERR_100) no such user or mailbox.
      63   550 5.1.1 Permanent failure logged.
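(For the curious, here's a rough Python sketch of how a server might
produce that kind of delayed final line. It is only an illustration of
the idea, not the code I actually run; the function name, the wording of
the reply, and the 60-second figure are all made up for the example, and
'conn' is assumed to be an already-connected TCP socket.)

    import time

    def send_rcpt_rejection(conn, detail_lines, final_delay=60):
        # Send a multi-line 550 reply, but hold back the closing line.
        # All of the "550-" continuation lines go out immediately; the
        # final "550 " line (the one a conforming client must wait for
        # before it can proceed) is only sent after 'final_delay' seconds.
        for text in detail_lines:
            conn.sendall(("550-5.1.1 " + text + "\r\n").encode("ascii"))
        time.sleep(final_delay)     # this pause is the rate limit itself
        conn.sendall(b"550 5.1.1 Permanent failure logged.\r\n")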
Remember that almost all the older Internet protocols, SMTP included,
were designed with the assumption that each party would be well behaved
and would fully honour even the misinterpretations of the Robustness
Principle, never mind its true intent at the protocol level. These days
peer pressure won't force all authors of misbehaving software to fix
their bugs, nor will it force all users to stop using such misbehaving
software. Indeed many users pay good money for misbehaving software and
still feel they got their money's worth! This goes double for spamware
of course.

Today we have to learn to design and implement our protocols with the
assumption that the peer or client will misbehave, whether by accident
or by malice. As a society we now put criminals in jail mostly to
prevent them from re-offending right away (and partly to show that they
can't get away with offending in the first place), and we must do
something similar with "broken" software. See, for instance, the
discussion in RFC 3117, especially section 4.5.
> If you're so right that an
> inactive connection on your mail server is such a tiny use of
> resources, which I'm prepared to believe, then presumably the inactive
> connection at the spammer's end is also such a tiny use of resources
> that they can just carry on and open another dozen - or hundred -
> insecure proxies at the same time, and carry on down their list of
> addresses?
This is true -- and this is why I keep repeating that the primary reason
for implementing full error response rate limiting is to protect the
server. It is only gravy if it also has a "good network neighbour"
effect by slowing down a spammer who doesn't employ multiple queue
runners. Indeed, when the problem is a client which re-connects
immediately to retry the same failed transaction, response rate limiting
also reduces the overall system load on the misbehaving client. Whether
that is good or bad for the client is irrelevant -- the rate limiting is
necessary regardless.
(obviously for full DoS and DDoS protection the server must also
implement connection timeouts and the postmaster must still be prepared
to firewall attackers to prevent them from simply opening the maximum
number of idle connections and immediately re-opening them on close --
but that's a different threat profile than we're really discussing here)
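(A connection timeout of that sort is also easy to sketch, assuming the
same kind of hypothetical socket-level server as in the earlier example;
the 300-second figure is just an illustration, not a recommendation.)

    import socket

    def read_command(conn, idle_timeout=300):
        # Read one command line from the client, but don't let an idle
        # connection pin a server slot forever: if nothing arrives within
        # 'idle_timeout' seconds, tell the client we're closing and do so.
        conn.settimeout(idle_timeout)
        try:
            return conn.recv(1024)
        except socket.timeout:
            conn.sendall(b"421 4.4.2 connection timed out, closing\r\n")
            conn.close()
            return b""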
> When I saw a spammer connecting to our mail server at approx 20 minute
> intervals, and churning their way down about three dozen addresses per
> call, irrespective of the fact that our mailer was saying, in effect,
> "5xx go away you're an open proxy" to every request they made, and
> leaving our log full of nuisance rejection entries, I thought it would
> be a nice idea to slow them down. In fact, I slowed them down to 4
> minutes per RCPT request, i.e a couple of hours per call, but they
> still made a fresh connection after about 20 minutes, and another
> after 20 minutes again, and at the end of the day they had still
> churned their way down the same number of addresses as before, just
> that they'd used several simultaneous connections (via different open
> proxies) to do it. So we ended up with some half-dozen calls, from
> different open proxies - but on looking at the log, evidently coming
> from the same prime cause - all churning their way slowly down their
> list of addresses, and all getting our 5xx go away responses.
Well at least they were not actually sending you any messages! ;-)
> Trying to drop the call was even worse - they'd come right back and
> repeat the same addresses over and over and over.
Of course -- that's why some kind of rate limiting and controlled error
response is necessary during each connection. What you were doing was
essentially a partial form of what I'm doing -- you were rejecting RCPT
commands, but only after a delay. Doing this universally for every
error response ensures that other kinds of bad behaviour don't also
result in the same kind of hammering of reconnections.
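(In the same hypothetical Python terms as before, the universal form is
just a wrapper around whatever sends the replies; the ERROR_DELAY value
and the names are invented for the sketch. In Exim the ACL "delay"
modifier gives much the same per-response stall, if I remember its name
correctly.)

    import time

    ERROR_DELAY = 30   # seconds to stall before completing any error reply

    def send_reply(conn, code, text):
        # Send any SMTP reply; if it is an error (4xx or 5xx), stall first.
        # Applying the delay to every error response, not just RCPT
        # rejections, means other kinds of bad behaviour can't be used to
        # hammer the server with instant retries either.
        if code >= 400:
            time.sleep(ERROR_DELAY)
        conn.sendall(("%d %s\r\n" % (code, text)).encode("ascii"))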
> OK, that's just one example. Maybe you think it was atypical.
No, probably not atypical -- except for the part about feeling that the
reject entries in your log were a nuisance (though perhaps that feeling
isn't atypical either, even though it should be).
Consider that the spammer may actually have been trying to deliver a new
and different spam every 20 minutes! ;-) (I've seen that happen,
especially from the triple-x spammers, though the spam content was only
very slightly different and possibly was mechanically generated. I
usually just blocked the sender though and didn't bother to see how long
it would go on for.)
> I'm not disputing that, but my real worry is whether it's really
> achieving anything - since, as you say yourself, the inactive TCP
> connection uses very little resource, so presumably it causes the
> spammer no more distress that it causes oneself? Indeed less distress
> to them, since most of them are using someone else's computers to do
> their work.
If you've ever had a misbehaving client hammer at your server with
multiple connections per minute (or more ;-), then error response rate
limiting is indeed worthwhile as it protects your server from such
misbehaviour.
I suspect such misbehaviour of clients is actually quite common on
larger sites which use any kind of SMTP-level errors to reject spammers,
though hopefully Avleen's logs will give us a better picture. It may
even be common enough to account for what seem to me to be extremely
inflated numbers of connections given the claimed size of the user base.
On medium-sized sites, like the one I've studied in depth, this kind of
misbehaviour is less common, but even when it happened only once a week
it was still causing major friction with angry customers who were
complaining frequently to management. Since alleviating this problem,
things have gone much more smoothly.
Eventually, if enough sites, and especially the large sites frequently
targeted by spammers, use error response rate limiting aggressively, then
spammers will be hampered. It'll be like the clam's one big powerful
muscle fighting the starfish, with its many highly redundant tiny little
muscles. Eventually the clam gives up because it is one against many.
I.e. the gravy only comes in the very long term after great persistence,
but with the redundancy of many hands we can make short work of the
problem.
--
Greg A. Woods
+1 416 218-0098; <g.a.woods@???>; <woods@???>
Planix, Inc. <woods@???>; VE3TCP; Secrets of the Weird <woods@???>