Re: [exim] Issues with greylisting - NEW IMPLEMENTATION

Author: Alain Williams
Date:  
To: Richard Clayton
CC: exim-users
Subject: Re: [exim] Issues with greylisting - NEW IMPLEMENTATION
On Thu, Jan 28, 2010 at 04:55:38PM +0000, Richard Clayton wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> In message <20100128155532.GD542@???>, Alain Williams
> <addw@???> writes
>
> >Notes for discussion:
> >
> >* It stores sending-domain & IP address of sender.
>
> I've seen combining the IP address with sending (or receiving) domain
> work very badly indeed with ISP smarthosts (ie the machines that
> millions of customers use...)


Hmmm. An ISP might want to use the triplet: destination domain, sender domain
& relaying IP. A spammer will send to many addresses; if 2 of them are hosted
by the ISP and the key leaves out the destination domain, then only the first
one tried will be protected by greylisting.

I will produce an alternate macro/database that accepts all 3,
although it may be simpler to just have the one that everyone can use.
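
To make the difference between the pair and the triplet concrete, here is a
rough Python sketch. It is not the Exim macro discussed in this thread: the
class name, the field names and the 300 second minimum delay are all
assumptions made up for illustration.

```python
class Greylist:
    """Toy in-memory greylist; key_fields chooses what identifies a sender."""

    def __init__(self, key_fields, min_delay=300):
        self.key_fields = key_fields   # e.g. ("sender_domain", "relay_ip")
        self.min_delay = min_delay     # seconds a new key must wait (assumed value)
        self.first_seen = {}           # key tuple -> time of the first attempt

    def check(self, now, **attempt):
        # Build the lookup key from only the fields this greylist cares about.
        key = tuple(attempt[f] for f in self.key_fields)
        first = self.first_seen.setdefault(key, now)
        return "accept" if now - first >= self.min_delay else "defer"


PAIR = ("sender_domain", "relay_ip")
TRIPLET = ("rcpt_domain", "sender_domain", "relay_ip")


def second_recipient_result(key_fields):
    """A spammer mails customer-a, retries later, then mails customer-b."""
    g = Greylist(key_fields)
    spam = dict(sender_domain="spam.example", relay_ip="192.0.2.1")
    g.check(0, rcpt_domain="customer-a.example", **spam)    # deferred
    g.check(600, rcpt_domain="customer-a.example", **spam)  # retry is accepted
    # First ever attempt to a second domain hosted by the same ISP:
    return g.check(600, rcpt_domain="customer-b.example", **spam)
```

With PAIR the first message to the second hosted domain is accepted straight
away, because the record created for the first domain already matches; with
TRIPLET it is deferred like any other new sender.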

> What happens is that the sending machine tries one email, which is then
> greylisted. The sending machine then marks the destination as
> unresponsive -- but eventually gets around to trying again. However, a
> different email is at the front of the queue, with a different customer
> domain and so that is also greylisted. The sending machine then marks
> the destination as unresponsive -- but eventually gets around to trying
> again. However, a different email is at the front of the queue...
>
> ... rinse and repeat until 4xx has been seen far too often, and all
> queued email is then marked undeliverable and returned to the senders.
>
> I don't understand why you feel that the property "will try again after
> a 4xx response" would not be associated solely with the IP address ??


So: are you suggesting that the only thing that should be stored in the database
is the relaying IP address ? That would seem to address your concern above,
however what happens if a group of machines behind one IP address (a small
business with a NATting firewall) become part of a spamming botnet ?
The first attempt will be blocked, but the following ones will be allowed through.

The pair (relay_ip & sender_domain) tends to be more robust since spammers
tend to set the sender_domain ''at random'', whereas true correspondents
keep their sender_domain.
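
The trade-off between the two keys can be simulated roughly as below. This is
only a sketch of the smarthost scenario described in the quoted text: the
function name, the 900 second retry interval, the 300 second minimum delay and
the example addresses are assumptions, and a real MTA's retry schedule is more
complicated than one attempt per interval.

```python
def attempts_until_accept(key_fields, queue, retry_interval=900,
                          min_delay=300, max_attempts=10):
    """Return the attempt number that first gets through, or None if none does.

    `queue` lists the sender domain at the front of the smarthost's queue on
    each successive attempt; every attempt comes from the same relay IP.
    """
    first_seen = {}  # key tuple -> time of the first attempt with that key
    for n, sender_domain in enumerate(queue[:max_attempts], start=1):
        now = (n - 1) * retry_interval
        attempt = {"relay_ip": "198.51.100.7", "sender_domain": sender_domain}
        key = tuple(attempt[f] for f in key_fields)
        first = first_seen.setdefault(key, now)
        if now - first >= min_delay:
            return n
    return None


# A different customer domain is at the front of the queue on every retry.
smarthost_queue = ["customer%d.example" % i for i in range(10)]

ip_only = attempts_until_accept(("relay_ip",), smarthost_queue)
pair = attempts_until_accept(("relay_ip", "sender_domain"), smarthost_queue)
```

Keyed on the IP alone, the second attempt gets through; keyed on the pair,
none of the ten attempts does. The same property is what lets a botnet behind
a NATting firewall through once its shared IP address has been seen.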

> If your concern is dynamic IP addresses (and you might do better to
> subscribe to a service that allows you to block those wholesale) then
> just age your database entries a bit more aggressively.


Greylisting should be done after RBL checks, which tend to blacklist dynamic IP
addresses.

> Note also that the bad effect discussed above is often hidden on Exim
> systems because an attempt will be made to deliver any new emails
> whatever the retry state and if one of them works then all the contents
> of the queue will be promptly retried... but when queues build up then
> nothing gets retried for hours and then aggressive aging of greylist
> entries will seriously hurt you!
>
> ... recipients can of course improve matters considerably by not
> applying greylisting to the major sources of incoming email (ie: all the
> reputable local ISPs).


That list of ''reputable local ISPs'' will be built up in the database
once it has been running for a bit. However getting a list of these to
prefill the database may not be easy. Perhaps a ''learn'' phase of a day
or so before taking notice (ie doing a defer) would do it, and then have
a long expiry time for records.
This list has nothing to do with ''local''; it is more a reflection of the MTAs
used by the correspondents of those for whom this MTA works. They will, at some
point, acquire a new correspondent, at which time the learning will need to
happen again, with the problems that you describe.
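
The ''learn'' phase idea could look something like the following Python
sketch. The one day learning period comes from the paragraph above; the class
name, the 300 second minimum delay and the 60 day expiry are assumptions picked
only for illustration.

```python
DAY = 86400  # seconds


class LearningGreylist:
    """Records every key, but only starts deferring after a learn period."""

    def __init__(self, started_at, learn_period=DAY, min_delay=300,
                 expiry=60 * DAY):
        self.started_at = started_at
        self.learn_period = learn_period  # log only during this time
        self.min_delay = min_delay        # wait required of a new key
        self.expiry = expiry              # long lifetime for known senders
        self.records = {}                 # key -> (first_seen, last_seen)

    def check(self, now, key):
        rec = self.records.get(key)
        if rec is not None and now - rec[1] > self.expiry:
            rec = None                    # stale record: treat the key as new
        first = rec[0] if rec else now
        self.records[key] = (first, now)
        if now - self.started_at < self.learn_period:
            return "accept"               # learning: take note, never defer
        return "accept" if now - first >= self.min_delay else "defer"
```

A sender seen during the learn period is accepted later without ever having
been deferred, while a new correspondent who turns up afterwards still has to
go through one defer and retry.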


I am running this in 'test' on a couple of real machines, logging into the database
but not doing greylisting. The schema is different from what I have published;
the key is: from_domain, sender_host_ip, to_domain
The table also contains a rcpt_count, which counts the number of times the row is updated.
I can make these available if people wish.
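
For anyone wanting to experiment in the meantime, a table with that key could
be mocked up along these lines. This is a guess at a workable layout using
Python and SQLite, not the schema running on those machines: only the three
key columns and rcpt_count come from the description above, while the table
name, the first_seen/last_seen columns and the choice to start rcpt_count at 1
are assumptions.

```python
import sqlite3


def open_db(path=":memory:"):
    """Create the (hypothetical) greylist table if it does not exist yet."""
    db = sqlite3.connect(path)
    db.execute("""
        CREATE TABLE IF NOT EXISTS greylist (
            from_domain    TEXT    NOT NULL,
            sender_host_ip TEXT    NOT NULL,
            to_domain      TEXT    NOT NULL,
            first_seen     INTEGER NOT NULL,   -- assumed column
            last_seen      INTEGER NOT NULL,   -- assumed column
            rcpt_count     INTEGER NOT NULL DEFAULT 1,
            PRIMARY KEY (from_domain, sender_host_ip, to_domain)
        )""")
    return db


def log_rcpt(db, from_domain, sender_host_ip, to_domain, now):
    """Log one RCPT without deferring; return the row's rcpt_count."""
    key = (from_domain, sender_host_ip, to_domain)
    where = "from_domain = ? AND sender_host_ip = ? AND to_domain = ?"
    cur = db.execute(
        "UPDATE greylist SET rcpt_count = rcpt_count + 1, last_seen = ? "
        "WHERE " + where, (now, *key))
    if cur.rowcount == 0:
        # First sighting of this triplet: create the row with rcpt_count 1.
        db.execute(
            "INSERT INTO greylist (from_domain, sender_host_ip, to_domain, "
            "first_seen, last_seen) VALUES (?, ?, ?, ?, ?)", (*key, now, now))
    db.commit()
    row = db.execute(
        "SELECT rcpt_count FROM greylist WHERE " + where, key).fetchone()
    return row[0]
```

Calling log_rcpt() for every RCPT in log-only mode gives a table from which
the effect of the pair or the triplet key can be compared afterwards.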

I am genuinely interested in trying to get the 'best' solution, so help would
be appreciated.

Regards

--
Alain Williams
Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
+44 (0) 787 668 0256 http://www.phcomp.co.uk/
Parliament Hill Computers Ltd. Registration Information: http://www.phcomp.co.uk/contact.php
Past chairman of UKUUG: http://www.ukuug.org/
#include <std_disclaimer.h>