Author: Marc Perkel Date: To: exim-users Subject: [exim] Automatic Generation of White Lists and Black Lists (DNS)
I've written a new DNS whitelist/blacklist engine and have been testing
it for about a month. It's working really well, and I'm thinking about
publishing it in the Wiki here. Maybe someone can do it even better than
I did once I put it up. But first I'll describe it to see if anyone is
interested.
The idea is that as ham/spam email comes through your server, you
look at the sending host's IP address and send a message to a central
server with the IP and whether it is ham or spam. The messages are simple, as follows:
ham 1.2.3.4
spam 5.6.7.8
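A reporting client for this message format could be sketched roughly as below. This is only an illustration, not the script I actually run (that one uses netcat): the collector host and port are hypothetical placeholders, TCP is assumed as the transport, and the function names are made up for the example.

```python
import ipaddress
import socket

# Hypothetical collector address; the post does not name a host or port.
COLLECTOR_HOST = "127.0.0.1"
COLLECTOR_PORT = 9999


def format_report(verdict: str, ip: str) -> bytes:
    """Build one report line such as b'ham 1.2.3.4\\n'."""
    if verdict not in ("ham", "spam"):
        raise ValueError(f"verdict must be 'ham' or 'spam', got {verdict!r}")
    # Validate the address so a malformed value never reaches the collector.
    address = ipaddress.ip_address(ip)
    return f"{verdict} {address}\n".encode("ascii")


def send_report(verdict: str, ip: str,
                host: str = COLLECTOR_HOST,
                port: int = COLLECTOR_PORT) -> None:
    """Open a TCP connection (an assumption), send one line, and close."""
    payload = format_report(verdict, ip)
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)
```

Calling send_report("spam", "5.6.7.8") would deliver the second example line above to whatever is listening on the assumed host and port.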
The sending script uses netcat to send the message to an IP address on a
specific port. The receiving server, running xinetd, receives the simple
messages and appends them to a text file named karma.txt. Then every 15
minutes the karma files are fed into sort and piped into a script that
counts the ham/spam for each IP. If an IP is 99% ham it gets written to
the zone files as ham; if it's 99% spam it gets written to the zone files
as spam. It also takes 20 hits on an IP for it to be listed at all. named
is then reloaded and the DNS is updated.
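The counting and thresholding step might look something like the following sketch. It is not my Pascal program: it uses an in-memory counter in place of the sort-and-pipe approach, reads "99%" as "at least 99%" (an assumption), and leaves out writing the zone files and reloading named. The function names are invented for the example.

```python
from collections import Counter

MIN_HITS = 20       # from the post: 20 hits before an IP is listed at all
THRESHOLD_PCT = 99  # from the post: 99% ham or 99% spam


def count_reports(lines):
    """Tally ham and spam reports per IP from 'verdict ip' lines."""
    ham, spam = Counter(), Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        verdict, ip = parts
        if verdict == "ham":
            ham[ip] += 1
        elif verdict == "spam":
            spam[ip] += 1
    return ham, spam


def classify(ham, spam):
    """Return (whitelist, blacklist) as sorted lists of IP strings."""
    white, black = [], []
    for ip in sorted(set(ham) | set(spam)):
        total = ham[ip] + spam[ip]
        if total < MIN_HITS:
            continue  # not enough data to list this IP
        # Integer arithmetic avoids floating-point rounding at the boundary.
        if ham[ip] * 100 >= THRESHOLD_PCT * total:
            white.append(ip)
        elif spam[ip] * 100 >= THRESHOLD_PCT * total:
            black.append(ip)
    return white, black
```

To run it over real data you would pass the lines of all the karma files into count_reports, then hand the two lists from classify to whatever writes your zone files.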
Aging is done by rotating the karma.txt file every 6 hours, keeping a
total of 12 of them, representing 3 days of data. The oldest one is deleted.
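The rotation could be done along these lines. This is a sketch under assumptions: the numbered naming scheme (karma.txt.1 through karma.txt.11) and the rotate function are hypothetical, and it would be run from cron every 6 hours. It does not deal with a receiver appending to the file at the moment of the rename.

```python
import os

KEEP = 12  # from the post: 12 files at 6 hours each cover 3 days


def rotate(directory: str, base: str = "karma.txt", keep: int = KEEP) -> None:
    """Shift base -> base.1 -> ... -> base.(keep-1); drop the oldest."""
    def path(n: int) -> str:
        # Generation 0 is the live file; older ones get a numeric suffix.
        name = base if n == 0 else f"{base}.{n}"
        return os.path.join(directory, name)

    oldest = path(keep - 1)
    if os.path.exists(oldest):
        os.remove(oldest)
    # Walk from oldest to newest so no file is overwritten.
    for n in range(keep - 2, -1, -1):
        if os.path.exists(path(n)):
            os.replace(path(n), path(n + 1))
    # Create a fresh, empty live file for the receiver to append to.
    open(path(0), "a").close()
```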
This simple approach is working very well. The processing takes about 15
seconds with 4 million lines of messages. It's thousands of times faster
than MySQL, which is what I used to use. And it works well enough that
most incoming email is prescreened as either ham or spam at connect time,
greatly reducing system load.
The big feature is that it works and it's simple. I'm hoping that if
there's some interest in this, someone else will make improvements
to it, as I'm not exactly a great programmer.
This is the simplified explanation. In practice I'm also generating
other lists. I make a "yellow" list of sources with mixed ham/spam
(gmail, yahoo, hotmail, comcast), which is useful for avoiding false
positives from other blacklists. My scoring program is written in Pascal
but could be converted to Perl easily.
With the volume of email I have, it generated 2,000 whitelisted IPs and
about 12,000 blacklisted IPs. But these are IPs that are hitting my
server specifically, so it does a good job for me.