On Fri, 2 Nov 2001, Greg A. Woods wrote:
> The default action on *BSD (and POSIX and SuSv2) of SIGCHLD is SIG_IGN.
But it appears that SIG_IGN does different things on different operating
systems. On *BSD it does not cause re-parenting (to 'init') of subsequently
created children; on some others (e.g. Solaris) it does. At least, that's
what my tests appear to indicate... and aha! It is documented. This is from
the Solaris 8 man page for signal:
    If any of the above functions are used to set SIGCHLD's
    disposition to SIG_IGN, the calling process's child
    processes will not create zombie processes when they
    terminate (see exit(2)).
Thanks for the useful summary.
> I find though that it's much better to explicitly catch SIGCHLD and call
> one of the wait() family as appropriate from within the handler:
I don't need to use SIGCHLD in Exim. The problem arises only when a
single incoming SMTP process reads lots of messages and fires off a
delivery process for each one. If there are only a few messages, the
receiving process will have died long before they finish. However, when
there are a lot of delivery processes, it's helpful to reap those that
have completed, to prevent the system filling up with zombies. It is
trivial to put
    while (waitpid(-1, NULL, WNOHANG) > 0);
inside the SMTP receiving loop. Indeed this was already present in the
loop for TCP/IP reception; this thread started because Sheldon was
firing in lots of messages using -bS, and I'd forgotten about it for
that loop.
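To spell out what that one-liner does, here is a small sketch, not the actual Exim code. The function names reap_finished_children() and spawn_delivery() are hypothetical stand-ins; the idea is that the receiving loop calls the reaper once per message, after handing the message to a delivery process.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: reap every child that has already exited,
   without blocking. Returns the number of children reaped. */
static int reap_finished_children(void)
{
    int reaped = 0;

    /* With WNOHANG, waitpid() returns 0 if children exist but none has
       exited yet, and -1 (errno ECHILD) if there are no children at
       all. Either result ends the loop. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    return reaped;
}

/* Hypothetical stand-in for handing one message to a delivery process. */
static pid_t spawn_delivery(void)
{
    pid_t pid = fork();

    if (pid == 0)
        _exit(0);   /* child: a real delivery would happen here */
    return pid;     /* parent: the child's pid, or -1 if fork() failed */
}
```

Because the call never blocks, it costs very little when nothing has finished, and any children still running are picked up on a later pass round the loop.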
> On systems without waitpid()
There's been a waitpid() call in Exim for a long time, so all the
systems it is used on must have it nowadays.
--
Philip Hazel University of Cambridge Computing Service,
ph10@??? Cambridge, England. Phone: +44 1223 334714.