[exim] updated [PATCH] ratelimit improvements

Author: Tony Finch
Date:  
To: exim-users
Old-Topics: [exim] [PATCH] ratelimit improvements
Subject: [exim] updated [PATCH] ratelimit improvements
Here's an updated version of the ratelimit patch. I've marked the
important changes relative to the last patch in the NewStuff entry below.
I haven't properly tested this version yet, but I said I'd post it today
so if you're really keen you can help me debug it :-)

|    The /noupdate option has been deprecated in favour of /readonly which has
|    slightly different semantics. The /leaky, /strict, and /readonly update
     modes are mutually exclusive. They are no longer recorded in the database,
     which may cause clashes if you are using /leaky and /strict with the same
|    key. The old /noupdate option used to add a spurious event to the rate
|    measurement; this no longer happens, though the difference is probably
|    invisible in most cases.


|    Exim now checks that the per_* options are used with an update mode that
|    makes sense for the current ACL. For example, when Exim is processing a
|    message (e.g. acl_smtp_rcpt or acl_smtp_data, etc.) you can specify
|    per_mail/leaky or per_mail/strict; otherwise (e.g. in acl_smtp_helo) you
|    must specify per_mail/readonly. If you omit the update mode it defaults to
|    /leaky where that makes sense (as before) or /readonly where required.
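
As a rough illustration of the rule above, the fragment below reads a per_mail
rate in the HELO ACL and updates it in the RCPT ACL. The ACL label
acl_check_helo, the limits, the key, and the message texts are all
hypothetical, and the syntax follows the options as described in this entry
for a patch that is not yet fully tested, so treat it as a sketch rather than
a known-good configuration.

```
# Hypothetical labels, assumed to be referenced from the main section as
#   acl_smtp_helo = acl_check_helo
#   acl_smtp_rcpt = acl_check_rcpt

acl_check_helo:
  # No message is being processed yet, so a per_mail rate can only be
  # read here: /readonly is the required update mode.
  warn  ratelimit   = 100 / 1h / per_mail / readonly / $sender_host_address
        log_message = mail rate already $sender_rate / $sender_rate_period

acl_check_rcpt:
  # A message is in progress, so /leaky or /strict is allowed.
  defer ratelimit = 100 / 1h / per_mail / strict / $sender_host_address
        message   = Sending rate limit exceeded, please try again later
```

Leaving the update mode off either statement should, per the text above, give
/readonly in the HELO ACL and /leaky in the RCPT ACL.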


| The /noupdate option is still supported for backwards compatibility. It's
| equivalent to /readonly except that in ACLs where /readonly is required you
| may specify /leaky/noupdate or /strict/noupdate which are read as /readonly.


    A useful new feature is the /count= option. This is a generalization
    of the per_byte option, so that you can measure the throughput of other
    aggregate values. For example, the per_byte option is now equivalent
    to per_mail/count=${if >{0}{$message_size} {0} {$message_size} }.
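
To make the equivalence concrete, here are the two forms side by side as
complete conditions. The 50M limit, the one hour period, the key, and the
choice of the DATA ACL are illustrative assumptions, not part of the patch
description.

```
acl_check_data:
  # Throughput limit written with per_byte ...
  defer ratelimit = 50M / 1h / per_byte / $sender_host_address
        message   = Too much data, please try again later

  # ... and the measurement that the text says is equivalent, spelled
  # out with /count=. The ${if} clamps the count to 0 when
  # $message_size is negative (size not known).
  defer ratelimit = 50M / 1h / per_mail / \
                    count=${if >{0}{$message_size} {0} {$message_size} } / \
                    $sender_host_address
        message   = Too much data, please try again later
```

In practice only one of the two statements would be used; they are shown
together purely for comparison.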


| The per_rcpt option has been generalized using the /count= mechanism
| (though it's more complicated than the per_byte equivalence). When it is
| used in acl_smtp_rcpt, the per_rcpt option counts one event; if it is used
| later (e.g. in acl_smtp_data) or in a non-SMTP ACL it counts all the
| recipients together. (The /count=$recipients_count behaviour used to work
| only in non-SMTP ACLs.) Note that using per_rcpt with a non-readonly update
| mode in more than one ACL will cause the recipients to be double-counted.
| (The per_mail and per_byte options don't have this problem.)
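
One way to avoid the double-counting described above is to update the
per_rcpt rate in a single ACL and use /readonly everywhere else. The limits,
key, and message texts below are made up for illustration.

```
acl_check_rcpt:
  # Counts one event per RCPT command and updates the stored rate.
  defer ratelimit = 200 / 1h / per_rcpt / strict / $sender_host_address
        message   = Too many recipients, please try again later

acl_check_data:
  # Same period, option, and key, but /readonly: the stored rate is
  # examined without counting this message's recipients a second time.
  warn  ratelimit   = 200 / 1h / per_rcpt / readonly / $sender_host_address
        log_message = recipient rate $sender_rate / $sender_rate_period
```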


    The major new feature is a mechanism for counting the rate of unique
    events. The new per_addr option counts the number of different
    recipients that someone has sent messages to in the last time period.
    Like the /count= option this is a general mechanism, so the per_addr
    option is equivalent to per_rcpt/unique=$local_part@$domain. You can,
    for example, measure the rate that a client uses different sender
    addresses with the options per_mail/unique=$sender_address.
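
The two uses mentioned above might look like this in an ACL. The limits,
periods, keys, and message texts are illustrative assumptions; only the
per_addr and /unique= options themselves come from the description above.

```
acl_check_rcpt:
  # At most 50 different recipient addresses per client host per day.
  defer ratelimit = 50 / 1d / per_addr / $sender_host_address
        message   = Too many different recipients, please try again later

  # The same measurement written out in its general form would be:
  #   ratelimit = 50 / 1d / per_rcpt / unique=$local_part@$domain / \
  #               $sender_host_address

  # At most 10 different sender addresses per client host per hour.
  deny  ratelimit = 10 / 1h / per_mail / unique=$sender_address / \
                    $sender_host_address
        message   = Too many different sender addresses
```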


    For each ratelimit key Exim stores the set of /unique= values that it
    has seen for that key. The whole set is thrown away when it is older
    than the rate smoothing period, so each different event is counted at
    most once per period. In /leaky mode, an event that causes the client
    to go over the limit is not added to the set, in the same way that the
    client's recorded rate is not updated in the same situation.


| When you combine the /unique= and /readonly options, the specific /unique=
| value is ignored, and Exim just retrieves the client's stored rate.


    The /unique= mechanism needs more space in the ratelimit database than
    the other ratelimit options in order to store the event set. The number
    of unique values is potentially as large as the rate limit, so the
    extra space required increases with larger limits.


    The uniqueification is not perfect: there is a small probability that a
    new event will appear to have happened before. For rates less than the
    limit it is more than 99.9% correct. However in /strict mode the
    measured rate can go above the limit, and this can cause Exim to under-
    count events by a significant margin. Fortunately, if the rate is high
    enough (2.7 times the limit) that the false positive rate goes above
    9%, then the over-full event set will be thrown away before the
    measured rate falls below the limit. Therefore the only harm should be
    that exceptionally high sending rates are logged incorrectly; any
    countermeasures you configure will be as effective as intended.
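
The percentages quoted above can be reproduced with the standard
false-positive estimate for a Bloom filter. Note that the message does not
say how the event set is stored: the Bloom filter itself, the k = 8 hash
functions, and the M = 16L bits of filter for a limit of L are all
assumptions, chosen only because they match the quoted figures. With n
distinct events recorded:

```latex
% Assumed parameters (not stated in the message): k = 8, M = 16 L.
p(n) \approx \left(1 - e^{-kn/M}\right)^{k}
     = \left(1 - e^{-n/(2L)}\right)^{8}

% At the limit, n = L:
p(L) \approx \left(1 - e^{-0.5}\right)^{8} \approx 0.00057
\quad\text{(more than 99.9\% correct)}

% At 2.7 times the limit, n = 2.7 L:
p(2.7L) \approx \left(1 - e^{-1.35}\right)^{8} \approx 0.091
\quad\text{(about 9\% false positives)}
```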


    The exim_dumpdb utility does not display /unique= event sets because they
    are represented in a form that makes a human-readable representation
    impossible. However you can use exim_fixdb to test membership of a set or
    to add events to it.


Tony.
--
<fanf@???> <dot@???> http://dotat.at/ ${sg{\N${sg{\
N\}{([^N]*)(.)(.)(.*)}{\$1\$3\$2\$1\$3\n\$2\$3\$4\$3\n\$3\$2\$4}}\
\N}{([^N]*)(.)(.)(.*)}{\$1\$3\$2\$1\$3\n\$2\$3\$4\$3\n\$3\$2\$4}}