Re: [exim] Spool format error in 4.77

Author: Kai Risku
Date:  
To: exim-users@exim.org
Subject: Re: [exim] Spool format error in 4.77
> OS? OS version? How was Exim built? How much of your configuration
> can you share with us? Compile configuration and runtime?
>
> Output of:
> exim -bP received_header_text


This is on Linux running kernel 2.6.26.6, on a spam-filtering email gateway. The system was originally Fedora Core 6, but I have regularly upgraded exposed software by recompiling RPM packages. I built the RPM package for Exim 4.77 using the same specfile I had used for version 4.71, and that version worked without problems for two years.

I don't know which parts of the compile-time and runtime configuration might be relevant, but I did not change anything when upgrading.

Unfortunately I cannot share the whole configuration file, and it is not entirely trivial. Basically, exim delivers incoming email to amavisd-new over SMTP and receives it back over SMTP before doing final delivery, but there are also a lot of ACL checks (ratelimits, explicit white/blacklistings, SPF checks, greylisting, etc.).

#  exim -bP received_header_text
received_header_text = Received: ${if def:sender_rcvhost {from $sender_rcvhost\n\t}{${if def:sender_ident {from $sender_ident }}${if def:sender_helo_name {(helo=$sender_helo_name)\n\t}}}}${if def:tls_peerdn {(${if eq{$tls_certificate_verified}{1} {Verified}{Unverified}} $tls_peerdn)\n\t}}by $primary_hostname ${if def:received_protocol {with $received_protocol}} ${if def:tls_cipher {($tls_cipher) }}(Exim $version_number)\n\tid $message_id${if def:received_for {\n\tfor <$received_for>}}

Nothing in the configuration, build process or environment has changed; I just upgraded the binary. I have tried eyeballing a diff of the sources between 4.71 and 4.77 but have not been able to spot anything suspicious. Of course, random memory corruption due to some subtle change, e.g. in some ACL or lookup implementation, is hard to find, but then I probably wouldn't be the only one having problems.

And the problems are really sparse. I have now had a total of four incidents within a week (three of which were within 8 hours last night) on a server receiving about 1000 emails per day (and temporarily rejecting around 1000 messages per day due to greylisting in the SMTP RCPT ACL). 
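To put those numbers in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes exactly 1000 accepted messages per day over a full seven days and ignores the greylisted attempts, so the result is an order-of-magnitude figure at best.

```python
# Rough per-message incident rate from the figures quoted above.
# Assumptions: exactly 7 days, exactly 1000 accepted messages per day;
# greylisted (temporarily rejected) attempts are not counted.
incidents = 4               # spool format errors seen so far
days = 7                    # "within a week"
accepted_per_day = 1000     # approximate accepted messages per day

accepted_total = accepted_per_day * days
rate = incidents / accepted_total

print(f"accepted messages in the period: {accepted_total}")
print(f"incidents per accepted message: {rate:.4%}")
print(f"about one incident per {accepted_total // incidents} messages")
```

In other words, roughly one message in a couple of thousand is affected, which is why reproducing it on demand is not practical.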

It is very hard to track down the problem when it cannot be reproduced at will. Perhaps it was just cosmic radiation (the server is not using ECC memory) and a strange coincidence that it happened just after upgrading Exim. I should try to run a full memtest on the server just to be sure, but that is not the easiest thing to do on a server in production use.

--
Kai.Risku@???     GSM  +358-40-767 8282
Oy Arrak Software Ab   http://www.arrak.fi