Re: [Exim] messages killed by ALRM frozen

Author: Philip Hazel
Date:  
To: John Jetmore
CC: exim-users
Subject: Re: [Exim] messages killed by ALRM frozen
On Fri, 31 Oct 2003, John Jetmore wrote:

> user@???: smtp transport process returned non-zero
> status 0x000e: terminated by signal 14
>
> 14 is alarm. Went and looked at the mainlog and here are the relevant
> entries:
>
> 2003-10-31 06:44:24 1AFYdZ-00039y-N6 <= sender@??? U=web P=local S=12696282
> 2003-10-31 07:04:35 1AFYdZ-00039y-N6 SMTP timeout while connected to recip.cn [x.x.x.x] after sending data block (2931672 bytes written): Connection timed out
> 2003-10-31 07:04:35 1AFYdZ-00039y-N6 == user@??? R=lookuphost T=remote_smtp defer (110): Connection timed out: SMTP timeout while connected to recip.cn [x.x.x.x] after sending data block (2931672 bytes written)
>
> (4.24, Linux 2.4.9)
>
> The timeout itself doesn't bother me, my server was only able to send ~2
> meg of a 12 meg email to the remote server in 20 minutes. What I'm not
> sure about is why the message was frozen.


Presumably because the subprocess crashed, as seen by the controlling
process. "Normal" timeouts should be detected and handled without
crashing. This looks like there is some bug in the code such that an
alarm did not get cancelled when it should have been. I will look
through the code, but I rather suspect this is going to be one of those
things that is hard to find...

--
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.
Get the Exim 4 book:    http://www.uit.co.uk/exim-book