Re: [Exim] remote delivery process count got out of step

Author: Philip Hazel
Date:  
To: Bernhard Erdmann
CC: exim-users@exim.org, frank
Subject: Re: [Exim] remote delivery process count got out of step
On Wed, 28 Aug 2002, Bernhard Erdmann wrote:

> 2002-08-28 12:18:54 17jzuY-0000Og-00 <= be@???
> H=apollo.berdmann.de [192.168.1.2] P=esmtp S=4512
> id=E17jzuY-0001Lb-00@???
> 2002-08-28 12:18:55 17jzuY-0000Og-00 remote delivery process count got
> out of step
>
> It happened using Exim 3.35 on Linux 2.4.19.


This problem was reported and fixed for AIX. It concerns how the end
of a child process is signalled to the parent. In Exim 4.05 the
following change was made:

 6. On systems where SA_NOCLDWAIT is defined, changed from using signal(
    SIGCHLD, SIG_DFL) to using sigaction(), with flags explicitly set zero, to
    ensure that SA_NOCLDWAIT is definitely off. This fixes a bug in AIX where
    subprocesses were disappearing without being turned into zombies for Exim
    to reap. There was a previous report of the error "remote delivery process
    count got out of step" on a Linux box that was never resolved. It is
    possible that this change fixes that too.
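
As a rough illustration of the change described in that entry, the sketch
below resets SIGCHLD to its default action through sigaction() with
sa_flags explicitly zero, falling back to signal() where SA_NOCLDWAIT is
not defined. It is not Exim's source; the function name
reset_sigchld_default is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical helper, not taken from Exim: restore the default SIGCHLD
   action so that exited children stay as zombies until the parent waits
   for them. Returns 0 on success, -1 on failure. */
int reset_sigchld_default(void)
{
#ifdef SA_NOCLDWAIT
    struct sigaction act;

    memset(&act, 0, sizeof act);
    act.sa_handler = SIG_DFL;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;            /* explicitly zero: SA_NOCLDWAIT is off */
    return sigaction(SIGCHLD, &act, NULL);
#else
    /* No SA_NOCLDWAIT on this system, so plain signal() is enough. */
    return signal(SIGCHLD, SIG_DFL) == SIG_ERR ? -1 : 0;
#endif
}
```

A parent would call this once before forking its delivery subprocesses,
so that a later wait() can still find them.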


I never had any confirmation that it did fix the Linux problem, but
nobody else has reported it, until today.

> What does "remote delivery process count got out of step" mean?


It means that the main Exim process thought it had one or more
subprocesses running, but when it came to wait for them, the operating
system told it that it had no children.

[There was a later similar problem that showed up under Linux when Exim
was being straced. This is fixed for 4.11, but unless you are stracing,
it isn't relevant.]

--
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.