Re: [exim] Whitelisting addresses/hosts my server has sent to

Author: W B Hacker
Date:
To: exim users
Subject: Re: [exim] Whitelisting addresses/hosts my server has sent to

Dave Pooser wrote:
> At $DAYJOB I operate a small corporate mail server-- roughly 70 users,
> roughly 1000 legitimate incoming messages per day and 3-5 times that many
> rejected as spam via Exim ACLs and SpamAssassin. Messages scoring 10+ on SA
> are rejected at SMTP time, messages scoring between 5 and 10 points (10-50
> per day) are dropped in a spambucket for me to review. All is working well.
>
> Because I am lazy, and because we work in a time-sensitive industry, I want
> to spend less time reviewing quarantined email. One way I think I can reduce
> the false positives (mainly somebody's friends or family forwarding joke
> emails with lots of pictures or lots of forwarded URLs) is by assuming that
> users and/or hosts that my users have emailed are less likely to be spammy.
> Specifically, I plan to have Exim track a list of recipient addresses and
> hosts and make sure that the hosts bypass blacklist checks and the users
> have an X-Known-Sender: header added that SpamAssassin will then recognize
> as a hamsign.
>
> (I recognize that for many ISPs the MXes will have no bearing on the sending
> hosts. That's okay; I'm not blacklisting other hosts, they just don't get
> the "get out of blacklists free" card that hosts that have received our mail
> get.)
>
> I ran through the logs for January through April, and found 17643 deliveries
> to 4383 unique addresses and 1844 unique hosts. (These numbers are wildly
> skewed by the fact that most outgoing mail was not going through Exim until
> the third week of March.) I have not yet done any analysis of how many new
> unique recipients are added each month, but it seems unlikely that would be
> more than 10%.
>
> Given those ballpark numbers, I assume that I'd find some sort of SQL
> database my best bet for tracking these "known sender" and "known host"
> databases. Is this correct, or is there a rule of thumb to tell me when an
> lsearch would be more efficient?
>
> Has anybody else put a system like this into production? Any experiences or
> pitfalls to share?

Given that corporate needs dictate we archive traffic in both directions, we
store an outbound copy in a dirtree structure that lends itself to a dsearch
for deriving one of the whitelist 'scores'.

An SQL DB - which we use extensively otherwise - might be more effective, but
this method works well enough and fast enough for traffic volumes not greatly
different than you mention.

Moreover, it needs no SQL code or maintenance to check for the prior presence of
the same record so as to prevent duplicates, nor does it create any extra
records as traffic from more than one sender goes to the same domain or
recipient within a domain.
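
To illustrate why a directory tree needs no duplicate checking, here is a
minimal Python sketch. It is not our actual setup: the layout
(root/domain/local-part) and the function names record_outbound,
is_known_domain and is_known_address are assumptions made up for the example.
Recording a recipient is just "make sure this directory exists", so repeating
it is harmless, and a lookup is a single existence test.

```python
import os
import tempfile


def _split(address):
    """Split an address into (local part, domain), lower-cased for storage."""
    local, _, domain = address.strip().lower().rpartition("@")
    if not local or not domain:
        raise ValueError("not a full address: %r" % address)
    for part in (local, domain):
        # Refuse anything that could escape the tree when used as a path.
        if part in (".", "..") or "/" in part or "\\" in part or "\0" in part:
            raise ValueError("unsafe path component in %r" % address)
    return local, domain


def record_outbound(root, recipient):
    """Note that we have sent mail to this recipient (hypothetical layout)."""
    local, domain = _split(recipient)
    # exist_ok=True makes this idempotent: no check for a prior record needed.
    os.makedirs(os.path.join(root, domain, local), exist_ok=True)


def is_known_domain(root, address):
    """True if we have sent mail to anyone at this address's domain."""
    _, domain = _split(address)
    return os.path.isdir(os.path.join(root, domain))


def is_known_address(root, address):
    """True if we have sent mail to this exact address."""
    local, domain = _split(address)
    return os.path.isdir(os.path.join(root, domain, local))


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    record_outbound(root, "alice@example.org")
    record_outbound(root, "alice@example.org")  # repeat is a no-op
    print(is_known_address(root, "alice@example.org"))
    print(is_known_domain(root, "bob@example.org"))
```

In a real archive the leaf would hold the stored message copies; the sketch
keeps only the directory names, which is all a whitelist lookup needs.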

The effort needed to find a particular outbound message when asked is not
greatly different than it would be if we stored outbound by sender instead of
recipient (not germane here, but we do that also), and in any case we need to
make such a search only a few times a year, if that.

YMMV,

Bill
