Re: [exim] ratelimit is counting wrong

Author: Graeme Fowler
Date:
To: exim-users
Subject: Re: [exim] ratelimit is counting wrong
On Wed, 2008-10-22 at 16:23 +0200, Marten Lehmann wrote:
> Just tell me how you would send a 0.4 portion of an email. So exim
> definitely does interpolate which I didn't ask it to do.


Step - away - from - the - frustration!

http://www.exim.org/exim-html-current/doc/html/spec_html/ch40.html#SECTratelimiting

"The syntax of the ratelimit condition is:

ratelimit = <m> / <p> / <options> / <key>

If the average client sending rate is less than m messages per time
period p then the condition is false; otherwise it is true."

"The parameter p is the smoothing time constant, in the form of an Exim
time interval, for example, 8h for eight hours. A larger time constant
means that it takes Exim longer to forget a client’s past behaviour. The
parameter m is the maximum number of messages that a client is permitted
to send in each time interval. It also specifies the number of messages
permitted in a fast burst. By increasing both m and p but keeping m/p
constant, you can allow a client to send more messages in a burst
without changing its overall sending rate limit. Conversely, if m and p
are both small, messages must be sent at an even rate."

The way the results are stored is as a *rate*, not as a hard counter,
and that rate is smoothed and decayed over the time period "p" you
define in the config.

Over period "p", the contribution of the earlier data points shrinks
until it has effectively dropped out of the calculation - which is how
you can end up with fractional values.
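
The smoothing described above can be sketched roughly as follows. This is
only an illustration, assuming an exponentially weighted moving average
with time constant "p"; the function name update_rate and its parameters
are made up for this example and are not taken from Exim's source.

```python
import math

def update_rate(prev_rate, interval, period, count=1.0):
    """Return the smoothed rate after one event (assumed EWMA model).

    prev_rate: rate stored after the previous event
    interval:  seconds since the previous event
    period:    the smoothing time constant "p", in seconds
    count:     size of this event (1 for a single connection)
    """
    # Guard against a zero or negative interval.
    interval = max(interval, 1e-9)
    x = interval / period
    a = math.exp(-x)  # weight kept by the old rate; shrinks as time passes
    return (1.0 - a) * count / x + a * prev_rate

rate = 0.0  # first connection: no history yet
for gap in (7, 6, 6, 6):  # seconds between successive connections
    rate = update_rate(rate, gap, 60)
    print(f"{rate:.1f}")
```

With gaps of 7, 6, 6 and 6 seconds and a 1 minute period this prints
0.9, 1.8, 2.6 and 3.3, which lines up with the start of the log below.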

As an example, I have Exim configured to defer hosts which have
connected more than 10 times in 1 minute:

  warn  ratelimit   = 0 / 1m / strict / noupdate
        log_message = Connect rate $sender_rate / $sender_rate_period


  defer ratelimit   = 10 / 1m / strict
        message     = Connecting too fast, please try again in 1 minute.


Then using netcat running in a loop thus:

while [ 1 ]; do sleep 6; echo "QUIT" | nc -v my.host 25; done

This creates a connection rate of just under 10 per minute - this short
period illustrates quite well how the maths works under the hood.
It takes significantly more than 10 connections to get the rate close
to 10/minute. Please excuse the long log here, but it shows the
decaying contribution of older data points as time progresses.

On the first connection the rate is 0:

2008-10-22 15:58:12 Connect rate 0.0 / 1m
2008-10-22 15:58:19 Connect rate 0.9 / 1m
2008-10-22 15:58:25 Connect rate 1.8 / 1m
2008-10-22 15:58:31 Connect rate 2.6 / 1m
2008-10-22 15:58:37 Connect rate 3.3 / 1m
2008-10-22 15:58:43 Connect rate 3.9 / 1m
2008-10-22 15:58:49 Connect rate 4.5 / 1m
2008-10-22 15:58:56 Connect rate 5.0 / 1m
2008-10-22 15:59:02 Connect rate 5.4 / 1m
2008-10-22 15:59:08 Connect rate 5.9 / 1m

Now we're a minute in, and the rate is only just over 6 due to the
moving average calculation:

2008-10-22 15:59:14 Connect rate 6.2 / 1m
2008-10-22 15:59:20 Connect rate 6.6 / 1m
2008-10-22 15:59:27 Connect rate 6.9 / 1m
2008-10-22 15:59:33 Connect rate 7.2 / 1m
2008-10-22 15:59:39 Connect rate 7.4 / 1m
2008-10-22 15:59:45 Connect rate 7.6 / 1m
2008-10-22 15:59:51 Connect rate 7.8 / 1m
2008-10-22 15:59:57 Connect rate 8.0 / 1m
2008-10-22 16:00:04 Connect rate 8.2 / 1m
2008-10-22 16:00:10 Connect rate 8.3 / 1m

Two minutes, still not up to 10:

2008-10-22 16:00:16 Connect rate 8.4 / 1m
2008-10-22 16:00:22 Connect rate 8.6 / 1m
2008-10-22 16:00:29 Connect rate 8.7 / 1m
2008-10-22 16:00:35 Connect rate 8.8 / 1m
2008-10-22 16:00:41 Connect rate 8.9 / 1m
2008-10-22 16:00:47 Connect rate 8.9 / 1m
2008-10-22 16:00:53 Connect rate 9.0 / 1m
2008-10-22 16:01:00 Connect rate 9.1 / 1m
2008-10-22 16:01:06 Connect rate 9.1 / 1m
2008-10-22 16:01:12 Connect rate 9.2 / 1m

Three minutes:

2008-10-22 16:01:18 Connect rate 9.3 / 1m
2008-10-22 16:01:24 Connect rate 9.3 / 1m
2008-10-22 16:01:31 Connect rate 9.3 / 1m
2008-10-22 16:01:37 Connect rate 9.4 / 1m
2008-10-22 16:01:43 Connect rate 9.4 / 1m
2008-10-22 16:01:49 Connect rate 9.4 / 1m
2008-10-22 16:01:55 Connect rate 9.4 / 1m
2008-10-22 16:02:02 Connect rate 9.5 / 1m
2008-10-22 16:02:08 Connect rate 9.5 / 1m
2008-10-22 16:02:14 Connect rate 9.5 / 1m

Four minutes:

2008-10-22 16:02:20 Connect rate 9.5 / 1m
2008-10-22 16:02:27 Connect rate 9.5 / 1m
2008-10-22 16:02:33 Connect rate 9.5 / 1m
2008-10-22 16:02:39 Connect rate 9.5 / 1m
2008-10-22 16:02:45 Connect rate 9.5 / 1m
2008-10-22 16:02:51 Connect rate 9.6 / 1m
2008-10-22 16:02:58 Connect rate 9.6 / 1m
2008-10-22 16:03:04 Connect rate 9.6 / 1m
2008-10-22 16:03:10 Connect rate 9.6 / 1m

Five minutes:

2008-10-22 16:03:16 Connect rate 9.6 / 1m
2008-10-22 16:03:23 Connect rate 9.6 / 1m
2008-10-22 16:03:29 Connect rate 9.6 / 1m

...and the rate steadies off at 9.6/9.7 due to the execution and
transport overheads involved in running in a loop like this.
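
The levelling-off can be checked with a small sketch, again assuming the
rate is an exponentially weighted moving average. For evenly spaced
events such an average converges to period / interval. The 6.2 second
spacing is an estimate read off the log timestamps above (50 entries
over 304 seconds), and the function names are purely illustrative.

```python
import math

def steady_state_rate(interval, period):
    """Rate the smoothed average converges to for evenly spaced events."""
    return period / interval

def simulate(n_updates, interval, period):
    """Apply the assumed EWMA update n_updates times, starting from zero."""
    x = interval / period
    a = math.exp(-x)  # fraction of the old rate that survives each step
    rate = 0.0
    for _ in range(n_updates):
        rate = (1.0 - a) / x + a * rate
    return rate

# Assumed spacing: 6 s sleep plus roughly 0.2 s of loop overhead.
print(round(steady_state_rate(6.2, 60), 2))  # long-run limit
print(round(simulate(49, 6.2, 60), 2))       # after 49 updates (about 5 minutes)
```

Both figures land between 9.6 and 9.7 under these assumptions, so the
plateau in the log comes from the loop's real spacing being a little
over 6 seconds rather than from any rounding in the rate itself.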

If I switch to a one second loop instead, again we start at zero:

2008-10-22 16:08:02 Connect rate 0.0 / 1m
2008-10-22 16:08:03 Connect rate 1.0 / 1m
2008-10-22 16:08:04 Connect rate 2.0 / 1m
2008-10-22 16:08:06 Connect rate 2.9 / 1m
2008-10-22 16:08:07 Connect rate 3.8 / 1m
2008-10-22 16:08:08 Connect rate 4.8 / 1m
2008-10-22 16:08:09 Connect rate 5.7 / 1m
2008-10-22 16:08:10 Connect rate 6.5 / 1m
2008-10-22 16:08:11 Connect rate 7.4 / 1m
2008-10-22 16:08:13 Connect rate 8.2 / 1m

We've now had 10 connections but the rate is *below* 10/minute due to
the diminishing contribution of the older data points.

2008-10-22 16:08:14 Connect rate 9.1 / 1m
2008-10-22 16:08:15 Connect rate 9.9 / 1m
2008-10-22 16:08:16 Connect rate 10.7 / 1m

Aha! Now we've exceeded the rate, and this loop continues to around
53/minute before levelling out.
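
To get a feel for how many connections it takes before a limit trips,
here is a rough sketch under the same assumed moving-average model, with
perfectly even spacing and a stored rate that starts at zero. The helper
updates_until_limit is hypothetical and exists only for this example.

```python
import math

def updates_until_limit(limit, interval, period, max_updates=100000):
    """Count rate updates until the smoothed rate reaches `limit`.

    Returns None if evenly spaced events can never reach the limit,
    i.e. when the long-run rate period / interval is not above it.
    """
    if period / interval <= limit:
        return None
    x = interval / period
    a = math.exp(-x)
    rate = 0.0
    for n in range(1, max_updates + 1):
        rate = (1.0 - a) / x + a * rate
        if rate >= limit:  # the condition is true once rate >= m
            return n
    return None

print(updates_until_limit(10, 1, 60))  # one connection per second
print(updates_until_limit(10, 6, 60))  # one connection every six seconds
```

Under these assumptions a one second loop needs 11 updates after the
initial connection to reach 10/minute, while a six second loop never
quite gets there (the second call returns None). Real timings are less
even, so the exact connection on which the limit trips will vary.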

Is it starting to make sense now?

It looks like your config is doing exactly what you've configured it to
do, but that you expected it to do something else.

I hope that helps :)

Graeme