Re: [exim] exim processes mails twice

Author: Dr. Clemens Hardewig
Date:
To: exim-users
Subject: Re: [exim] exim processes mails twice

On Tuesday, 31 October 2006 23:58, Heiko Schlittermann wrote:

> It looks like you take incoming messages, send them to spamc and spamc
> in turn sends them back to Exim via a second invocation of Exim.
>
> (It's the "classical" AMaViS model.)
>
> So from Exim's POV the mail is there twice; Exim can't know that it's the
> message it has already seen.
>
> Way out?
>
> Use spamc as transport filter.
This is precisely the configuration I have:
#####################################################
### transport/30_exim4-config_spamcheck
#####################################################

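spamcheck:
  debug_print = "T: spamassassin_pipe for $local_part@$domain"
  driver = pipe
  # re-inject the filtered message into Exim as batched SMTP,
  # marking it with the received protocol "spam-scanned"
  command = /usr/sbin/exim4 -oMr spam-scanned -bS
  use_bsmtp = true
  # pass the message through spamc before it reaches the command
  transport_filter = /usr/bin/spamc
  home_directory = "/tmp"
  current_directory = "/tmp"
  user = Debian-exim
  group = Debian-exim
  return_fail_output = true
  log_output = true
  # no extra prefix or suffix around the message
  message_prefix =
  message_suffix =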

> Or use SA-Exim.
> Or use the spam capabilities in ACL.
>
>
> Best regards from Dresden
> Viele Grüße aus Dresden
> Heiko Schlittermann
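
For comparison, the ACL variant mentioned in the quoted text could look roughly like the fragment below. This is only a sketch under assumptions that are not part of this thread: an Exim built with content scanning, spamd listening on 127.0.0.1 port 783, and an ACL named acl_check_data (a name chosen purely for illustration).

#####################################################
### ACL sketch (assumed, not from my configuration)
#####################################################

# main section
spamd_address = 127.0.0.1 783
acl_smtp_data = acl_check_data

# ACL section
acl_check_data:
  # scan as user "nobody"; ":true" keeps the condition true
  # so that the header is added whatever the score is
  warn    spam       = nobody:true
          add_header = X-Spam-Score: $spam_score ($spam_bar)
  accept

With this approach the message is scanned while it is being received, so no second Exim invocation is involved.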

I'm quite happy with the transport filter method, except that, for whatever
reason, each mail is spam-scanned twice, so the autolearn function of the
Bayes filter gives wrong/misleading results (see the syslog excerpt below):

Nov  1 18:50:30 server spamd[5016]: connection from localhost.localdomain [127.0.0.1] at port 34287
Nov  1 18:50:31 server spamd[5016]: checking message <200611011949.10622.xxxx@???> aka <YCPZEB.A.bJH.42NSFB@murphy> for nobody:65534.
                                    ^^^^^^^^ (see below)
Nov  1 18:50:38 server spamd[5016]: clean message (-2.6/5.0) for nobody:65534 in 7.6 seconds, 2943 bytes.
Nov  1 18:50:38 server spamd[5016]: result: . -2 - BAYES_00,NORMAL_HTTP_TO_IP scantime=7.6,size=2943,mid=<200611011949.10622.xxxxx@???>,rmid=<YCPZEB.A.bJH.42NSFB@murphy>,bayes=5.55111512312578e-17,autolearn=ham
    (autolearn=ham: HERE OK)
Now the second instance of spamd is called (why, from where??):
Nov  1 18:50:38 server spamd[5017]: connection from localhost.localdomain [127.0.0.1] at port 34292
Nov  1 18:50:38 server spamd[5017]: processing message <200611011949.10622.xxxxx@???> aka <YCPZEB.A.bJH.42NSFB@murphy> for
                                    ^^^^^^^^^^ (see below)
Nov  1 18:50:50 server spamd[5017]: clean message (-2.6/5.0) for Debian-exim:65534 in 11.5 seconds, 2967 bytes.
Nov  1 18:50:50 server spamd[5017]: result: . -2 - BAYES_00,NORMAL_HTTP_TO_IP scantime=11.5,size=2967,mid=<200611011949.10622.xxxx@???>,rmid=<YCPZEB.A.bJH.42NSFB@murphy>,bayes=0,autolearn=unavailable
    (autolearn=unavailable: wrong here, a result of the fact that the scan results are already in the bayes db)

Mapping the Exim message IDs shows that these are the two different message
IDs of the two Exim calls I described yesterday. After processing, both mails
are delivered to the IMAP server (Cyrus), which then detects the duplicate and
drops one (unfortunately the one with the wrong result :((( ).

Are the two calls to spamd equivalent at all? (NOTE: in the first call the
email is CHECKED, in the second run it is PROCESSED.) Is there any
configuration fix to avoid the double call/processing by spamd?

Questions upon questions ...

BR Clemens
