[exim] mail lost during nfs outage

Author: Jason Keltz
Date:
To: exim-users
Subject: [exim] mail lost during nfs outage
Hi.

I'm running exim 4.51 on our mail server (Linux kernel 2.4.31).
/var/mail on the mail server is mounted from our NFS file server.

The other day, during an outage of our file server, it seems that people
lost mail. In one demonstrated case, a person received a message that
contained only a "From" line. In my exim logs, I can identify the
messages that were "lost" as follows:

2005-11-07 18:32:49 1EZGRr-0006XY-P7 spam acl condition: error reading from spamd socket: Connection timed out
2005-11-07 18:32:49 1EZGRr-0006XY-P7 <= someone@??? H=(0.0.0.0) [1.1.1.1] P=smtp S=2180 id=someid
2005-11-07 18:37:49 1EZGRr-0006XY-P7 ** |procmail <user@???> R=userforward T=address_pipe: transport filter timeout while writing to pipe
2005-11-07 18:37:49 1EZGYf-0007kB-U2 <= <> R=1EZGRr-0006XY-P7 U=exim P=local S=3033
2005-11-07 18:37:49 1EZGRr-0006XY-P7 Completed
(names/IPs changed to protect the innocent)

("lost" in this case = mail was not received and wasn't queued, yet the
mail server seemed happy that the job had completed successfully)
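
For anyone wanting to check their own logs for the same pattern, here is a
rough sketch of how the affected messages could be picked out of the main
log. It assumes the log line layout shown above (date, time, message id,
then the rest); the function name find_filter_timeouts and the SAMPLE text
are made up for illustration and are not part of exim.

```python
import re

# Assumed main log layout: "<date> <time> <message-id> <rest of line>".
LINE_RE = re.compile(
    r"^(\S+ \S+) ([0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{2}) (.*)$"
)


def find_filter_timeouts(lines):
    """Map each message id that failed with a transport filter timeout
    to the id of the bounce generated for it (None if no bounce is seen)."""
    failed = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        _, msg_id, rest = m.groups()
        if rest.startswith("** ") and "transport filter timeout" in rest:
            # "**" marks a failed delivery in the main log.
            failed.setdefault(msg_id, None)
        elif rest.startswith("<= <>"):
            # A bounce: the R= field on this line refers to the original message.
            ref = re.search(r"\bR=(\S+)", rest)
            if ref and ref.group(1) in failed:
                failed[ref.group(1)] = msg_id
    return failed


# The log excerpt from this message, used as sample input.
SAMPLE = """\
2005-11-07 18:32:49 1EZGRr-0006XY-P7 spam acl condition: error reading from spamd socket: Connection timed out
2005-11-07 18:32:49 1EZGRr-0006XY-P7 <= someone@??? H=(0.0.0.0) [1.1.1.1] P=smtp S=2180 id=someid
2005-11-07 18:37:49 1EZGRr-0006XY-P7 ** |procmail <user@???> R=userforward T=address_pipe: transport filter timeout while writing to pipe
2005-11-07 18:37:49 1EZGYf-0007kB-U2 <= <> R=1EZGRr-0006XY-P7 U=exim P=local S=3033
2005-11-07 18:37:49 1EZGRr-0006XY-P7 Completed
"""

if __name__ == "__main__":
    for original, bounce in find_filter_timeouts(SAMPLE.splitlines()).items():
        print(original, "bounced as", bounce)
```

To run it against a real log, pass an open file object instead of
SAMPLE.splitlines(). The log path varies between installs, so it is left out
here.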

I have spamd/clamd set up for scanning, configured (as per the standard
install documentation, using defer_ok) so that if they are not available,
mail still goes through.

>  # Reject virus infested messages.
>  deny  message = This message contains a virus ($malware_name).
>        malware = */defer_ok
>
>  # Reject spam messages with score >= 10
>  deny  message = This message scored $spam_score spam points.
>        spam = exim:true/defer_ok
>        condition = ${if >{$spam_score_int}{100}{1}{0}}


However, I don't think that is the problem. After the mail is scanned at
the system level, it is passed through spamc a second time at the user
level, with the user's own spam preferences. I think this may be what is
causing the problem.

The "userforward" router is set up like this:

userforward:
driver = redirect
check_local_user
owners = root
file = $home/.forward
no_verify
no_expn
check_ancestor
allow_defer
allow_fail
file_transport = address_file
pipe_transport = address_pipe
reply_transport = address_reply

The transport filter that is failing is, I guess, the one on this transport:

address_pipe:
driver = pipe
transport_filter = /cs/local/bin/spamc -U /tmp/spamd.sock
return_fail_output

/cs/local/bin/spamc was unavailable at the time because it also lives on
the NFS server.

Can anyone suggest what might be wrong here?

Thanks,

Jason.