[exim-dev] race condition / crash in 4.86-RC4

Author: Tony Finch
Date:  
To: exim-dev
Subject: [exim-dev] race condition / crash in 4.86-RC4
I tried to roll out 4.86 yesterday and encountered a fairly nasty bug.

There appears to be some kind of bug in the locking that sometimes makes
Exim stumble when a queue runner encounters another process working on a
message. I have seen this before, though it seemed to be benign enough
that I haven't tried to fix it yet.

With my updated build, Exim occasionally crashed. This seems to occur
after delivery has completed (the message got where it was going and there
was an entry in the message's -J journal file) but it failed before
getting the delivery status to the parent process. So the log (example
below) says == because of the crash and the => is missing because of a
mismatch between journalling and logging the delivery.
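
For anyone reading the log excerpt below: the "non-zero status 0x000b" is a raw POSIX wait status, and its low bits carry the number of the signal that killed the transport process. A minimal sketch of decoding it with Python's standard library follows; the function name describe_wait_status is only illustrative and is not part of Exim.

```python
import os
import signal


def describe_wait_status(status: int) -> str:
    """Describe a raw POSIX wait status, e.g. the 0x000b from the log line."""
    if os.WIFSIGNALED(status):
        signum = os.WTERMSIG(status)
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known to this platform's signal table.
            name = "unknown"
        return f"terminated by signal {signum} ({name})"
    if os.WIFEXITED(status):
        return f"exited with code {os.WEXITSTATUS(status)}"
    return f"unrecognised status {status:#06x}"


print(describe_wait_status(0x000B))
```

On Linux, signal 11 is SIGSEGV, so the transport process died of a segmentation fault rather than exiting with an error code.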

I shall investigate further.

2015-07-21 00:25:01 +0100 1ZHKQf-0004fX-YV <= (redacted) H=(redacted) I=[131.111.8.139]:25 P=esmtps X=TLSv1:AES256-SHA:256 CV=no S=1132 id=E1ZHKQf-0005K3-Cs@(redacted)
2015-07-21 00:25:01 +0100 1ZHKQf-0004fX-YV Spool file is locked (another process is handling this message)
2015-07-21 00:25:01 +0100 1ZHKQf-0004fX-YV == (redacted) R=hermes_lmtp T=hermes_lmtp defer (-1): smtp transport process returned non-zero status 0x000b: terminated by signal 11
2015-07-21 00:25:01 +0100 1ZHKQf-0004fX-YV Frozen
2015-07-21 00:25:09 +0100 1ZHKQf-0004fX-YV Message is frozen
2015-07-21 00:33:55 +0100 1ZHKQf-0004fX-YV Unfrozen by forced delivery
2015-07-21 00:33:55 +0100 1ZHKQf-0004fX-YV Completed
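
To see how often this happens, the pattern in the excerpt above can be searched for in the main log: a message with a == defer line mentioning "terminated by signal", a later Completed line, and no => or -> delivery line. The sketch below is only a heuristic and makes some assumptions: timestamps carry a UTC offset as in the excerpt, message IDs use the 6-6-2 character layout shown, and the helper name crashed_without_delivery_line is made up for illustration. A message can also complete without => for unrelated reasons (for example when every recipient bounced), so hits need checking by hand.

```python
import re
from collections import defaultdict

# Date, time, UTC offset, message ID, then the remainder of the log line.
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4} "
    r"([0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{2}) (.*)$"
)


def crashed_without_delivery_line(lines):
    """Return message IDs that crashed in a transport, completed, and have no => or ->."""
    events = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            events[m.group(1)].append(m.group(2))

    suspects = []
    for msg_id, rest in events.items():
        crashed = any(
            r.startswith("== ") and "terminated by signal" in r for r in rest
        )
        delivered = any(r.startswith(("=> ", "-> ")) for r in rest)
        completed = any(r == "Completed" for r in rest)
        if crashed and completed and not delivered:
            suspects.append(msg_id)
    return suspects


# Three lines taken from the excerpt above.
sample = [
    "2015-07-21 00:25:01 +0100 1ZHKQf-0004fX-YV == (redacted) R=hermes_lmtp "
    "T=hermes_lmtp defer (-1): smtp transport process returned non-zero "
    "status 0x000b: terminated by signal 11",
    "2015-07-21 00:25:01 +0100 1ZHKQf-0004fX-YV Frozen",
    "2015-07-21 00:33:55 +0100 1ZHKQf-0004fX-YV Completed",
]
print(crashed_without_delivery_line(sample))
```

In practice the function would be fed the lines of the real mainlog file instead of the three-line sample.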

Tony.
--
<fanf@???> <dot@???> http://dotat.at/ ${sg{\N${sg{\
N\}{([^N]*)(.)(.)(.*)}{\$1\$3\$2\$1\$3\n\$2\$3\$4\$3\n\$3\$2\$4}}\
\N}{([^N]*)(.)(.)(.*)}{\$1\$3\$2\$1\$3\n\$2\$3\$4\$3\n\$3\$2\$4}}