Re: [Exim] Spamassassin

Author: dman
Date:  
To: exim-users
Subject: Re: [Exim] Spamassassin
On Sun, Jan 27, 2002 at 08:22:04PM -0600, Terry Shows wrote:
| I am trying to set up spamassassin to work with Exim, and the suggestions on
| this user group seem to differ in their implementation.
|
| One suggests setting up a perl script to run within an exim transport, and
| the other suggests starting up the spamd daemon.
|
| I am interested in any comments or suggestions out there to keep me from
| having to "reinvent" this procedure.
|
| Thanks in advance for all of your words of wisdom!


http://dman.ddts.net/~dman/config_docs/exim_spamassassin.html

If you want, you can replace "spamc -f" with "spamassassin -P" in the
exim config.

At the core, SA is a collection of perl modules and a default config
that perform a number of operations. The distribution comes with
"spamassassin" (in /usr/bin with the debian package), which is a perl
script that takes a message on stdin, feeds it through the modules,
and dumps the result back on stdout. This is fully functional and
requires no daemon. The downside is that you have to start perl for
each and every message you filter. That takes some time and puts load
on the system.

The distribution also comes with "spamd" (also in /usr/bin). spamd is
a perl script that listens on a socket to receive messages, feeds them
through the modules, and dumps the result back to the socket. It is a
daemon, thus you start perl once on the system and use that running
process to filter all messages. In addition to removing the startup
overhead from each message, it is possible for the daemon to handle
load limiting to prevent DoSing the system (I don't know if it does,
but it is conceivable).

The only problem is that shells can't access sockets directly. Thus
'spamc', which is also part of the SA distribution, is used as the
stdin/stdout<->socket interface layer. spamc is written in C so that
it doesn't incur the overhead of perl, but does little more than stuff
stdin into a socket and dump the socket on stdout. 'spamc' and
'spamassassin' are interchangeable in the exim config; which one you
use is just a matter of preference.
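
As a rough illustration of that "stdin/stdout<->socket" role, here is
a minimal Python sketch. It does NOT speak the real spamd protocol and
is not how spamc is implemented; toy_filter_daemon, relay, demo and the
X-Toy-Checked header are all made-up names for this example. The toy
daemon just prepends a header and sends the message back.

```python
import socket
import threading


def toy_filter_daemon(server: socket.socket) -> None:
    """Hypothetical stand-in for a filter daemon (not real spamd).

    Handles a single connection: reads one whole message, tags it,
    and writes the result back on the same socket.
    """
    conn, _ = server.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # client finished sending
                break
            chunks.append(data)
        conn.sendall(b"X-Toy-Checked: yes\n" + b"".join(chunks))


def relay(message: bytes, host: str, port: int) -> bytes:
    """The client side: push the message into the socket, read the reply."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(message)
        sock.shutdown(socket.SHUT_WR)  # tell the daemon the message is complete
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # daemon closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo() -> bytes:
    """Start the toy daemon on a free local port and filter one message."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    worker = threading.Thread(target=toy_filter_daemon, args=(server,), daemon=True)
    worker.start()
    try:
        return relay(b"Subject: test\n\nbody\n", "127.0.0.1", port)
    finally:
        worker.join(timeout=5)
        server.close()
```

A command-line client built on this would pass sys.stdin.buffer.read()
to relay() and write the result to sys.stdout.buffer, which is what
lets it sit in a shell pipeline.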

Just for a point of reference, on my machine I get the following
results (the test message is spam that SA correctly tags):

$ time spamc -f < message > /dev/null

real    0m0.808s
user    0m0.000s
sys     0m0.010s


$ time spamassassin -P < message > /dev/null

real    0m7.425s
user    0m0.860s
sys     0m0.060s
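
The per-message startup cost can be felt without perl or SA installed.
The sketch below is only an analogy: it uses Python in place of perl,
and time_fresh_interpreter / time_in_process are made-up helper names.
It compares starting a fresh interpreter for every "message" against
doing the work in a process that is already running.

```python
import subprocess
import sys
import time


def time_fresh_interpreter(runs: int = 5) -> float:
    """Average seconds to start a new interpreter that does nothing."""
    start = time.perf_counter()
    for _ in range(runs):
        # One new process per "message", like running a script per mail.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return (time.perf_counter() - start) / runs


def time_in_process(runs: int = 5) -> float:
    """Average seconds for a no-op call inside the running process."""
    def no_op() -> None:
        return None

    start = time.perf_counter()
    for _ in range(runs):
        # The interpreter is already up, like a long-running daemon.
        no_op()
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    print(f"fresh interpreter: {time_fresh_interpreter():.4f}s per run")
    print(f"already running:   {time_in_process():.7f}s per run")
```

The absolute numbers depend on the machine and say nothing about SA
itself; the point is only that process startup is paid once with a
daemon and once per message without one.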



The other difference between the techniques is that I'm using a shell
command in exim.conf with the use_shell option, whereas some others
write a perl script to use as the command. The perl scripts are
replacements for 'spamassassin' (that is, they access the SA modules
directly), and the nice thing is that, at least at the start, they
don't have to worry about shell metacharacters in addresses. I think
the difference between the styles is minimal, since all the perl
scripts I've seen exec() exim at the end anyway and will run into the
same problems there. In addition, it doesn't feel right to have to
create my own script to use an existing program; I like pipes better.
I'm also not a big perl fan (I can kinda read some of it, but I've
never RTFMed on modules or its object model). The other tradeoff is
that those setups start a separate perl instance for each message,
whereas I can switch to spamc by changing only a handful of
characters.
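
To show why shell metacharacters in addresses matter, here is a small
Python sketch (not exim config, and not taken from any of the scripts
mentioned above; echo_via_argv and quoted_for_shell are made-up names,
and the address is a contrived example). It shows the two usual ways
to keep an address from being interpreted by a shell: pass it as its
own argv element so no shell parses it, or quote it before building a
command string.

```python
import shlex
import subprocess
import sys


def echo_via_argv(recipient: str) -> str:
    """Hand the address to a child process as a separate argv element.

    No shell is involved, so characters like ' | $ ( ) stay literal.
    """
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", recipient],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


def quoted_for_shell(recipient: str) -> str:
    """Quote the address so a POSIX shell treats it as one literal word."""
    return shlex.quote(recipient)


if __name__ == "__main__":
    address = "o'brien|$(id)@example.com"  # contrived, hostile-looking address
    print(echo_via_argv(address))
    print(quoted_for_shell(address))
```

Either approach only protects the step where it is applied; if a later
stage builds another shell command from the same address, the quoting
has to be done again there.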

HTH,
-D

--

Failure is not an option. It is bundled with the software.