ph10 2004/11/11 16:03:47 GMT
Modified files:
exim-doc/doc-txt ChangeLog
exim-src/src daemon.c
Log:
(a) Changed to using os_restarting_signal() for setting the SIGCHLD
handler in the daemon; this may fix Tony's obscure occasional crashes.
(b) Reduced the size of the race window for noticing SIGCHLDs (note: it
is not *essential* for Exim to see them).
Revision Changes Path
1.23 +8 -0 exim/exim-doc/doc-txt/ChangeLog
1.3 +25 -8 exim/exim-src/src/daemon.c
Index: ChangeLog
===================================================================
RCS file: /home/cvs/exim/exim-doc/doc-txt/ChangeLog,v
retrieving revision 1.22
retrieving revision 1.23
diff -u -r1.22 -r1.23
--- ChangeLog 11 Nov 2004 11:40:36 -0000 1.22
+++ ChangeLog 11 Nov 2004 16:03:47 -0000 1.23
@@ -1,4 +1,4 @@
-$Cambridge: exim/exim-doc/doc-txt/ChangeLog,v 1.22 2004/11/11 11:40:36 ph10 Exp $
+$Cambridge: exim/exim-doc/doc-txt/ChangeLog,v 1.23 2004/11/11 16:03:47 ph10 Exp $
Change log file for Exim from version 4.21
-------------------------------------------
@@ -85,6 +85,14 @@
matching the list had actually called for a host name (not very likely,
since this list is usually IP addresses). A host name is now passed as
"[x.x.x.x]".
+
+24. Changed the calls that set up the SIGCHLD handler in the daemon to use the
+ code that specifies a non-restarting handler (typically sigaction() in
+ modern systems) in an attempt to fix a rare and obscure crash bug.
+
+25. Narrowed the window for a race in the daemon that could cause it to ignore
+ SIGCHLD signals. This is not a major problem, because they are used only to
+ wake it up if nothing else does.
Exim version 4.43
Index: daemon.c
===================================================================
RCS file: /home/cvs/exim/exim-src/src/daemon.c,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -r1.2 -r1.3
--- daemon.c 10 Nov 2004 10:29:56 -0000 1.2
+++ daemon.c 11 Nov 2004 16:03:47 -0000 1.3
@@ -1,4 +1,4 @@
-/* $Cambridge: exim/exim-src/src/daemon.c,v 1.2 2004/11/10 10:29:56 ph10 Exp $ */
+/* $Cambridge: exim/exim-src/src/daemon.c,v 1.3 2004/11/11 16:03:47 ph10 Exp $ */
/*************************************************
* Exim - an Internet mail transport agent *
@@ -84,8 +84,8 @@
main_sigchld_handler(int sig)
{
sig = sig; /* Keep picky compilers happy */
+os_non_restarting_signal(SIGCHLD, SIG_DFL);
sigchld_seen = TRUE;
-signal(SIGCHLD, SIG_DFL);
}
@@ -1377,7 +1377,7 @@
/* Set up the handler for termination of child processes. */
sigchld_seen = FALSE;
-signal(SIGCHLD, main_sigchld_handler);
+os_non_restarting_signal(SIGCHLD, main_sigchld_handler);
/* If we are to run the queue periodically, pretend the alarm has just gone
off. This will cause the first queue-runner to get kicked off straight away. */
@@ -1624,17 +1624,34 @@
}
DEBUG(D_any) debug_printf("Listening...\n");
+
+ /* In rare cases we may have had a SIGCHLD signal in the time between
+ setting the handler (below) and getting back here. If so, pretend that the
+ select() was interrupted so that we reap the child. This might still leave
+ a small window when a SIGCHLD could get lost. However, since we use SIGCHLD
+ only to do the reaping more quickly, it shouldn't result in anything other
+ than a delay until something else causes a wake-up. */
- if ((lcount = select(max_socket + 1, (SELECT_ARG2_TYPE *)&select_listen,
- NULL, NULL, NULL)) < 0)
+ if (sigchld_seen)
+ {
+ lcount = -1;
+ errno = EINTR;
+ }
+ else
+ {
+ lcount = select(max_socket + 1, (SELECT_ARG2_TYPE *)&select_listen,
+ NULL, NULL, NULL);
+ }
+
+ if (lcount < 0)
{
select_failed = TRUE;
lcount = 1;
}
-
+
/* Loop for all the sockets that are currently ready to go. If select
- actually failed, we have set the count to 1 and a flag, so as to use the
- common error code for select/accept below. */
+ actually failed, we have set the count to 1 and select_failed=TRUE, so as
+ to use the common error code for select/accept below. */
while (lcount-- > 0)
{
@@ -1785,7 +1802,7 @@
if (sigchld_seen)
{
sigchld_seen = FALSE;
- signal(SIGCHLD, main_sigchld_handler);
+ os_non_restarting_signal(SIGCHLD, main_sigchld_handler);
}
/* Handle being woken by SIGHUP. We know at this point that the result