Re: [EXIM] accept() errors instead of accepting when network…

Author: Exim Users Mailing List
Date:  
To: exim-users
Subject: Re: [EXIM] accept() errors instead of accepting when network unreachable
[ On , November 21, 1998 at 23:55:52 (-0000), D. J. Bernstein wrote: ]
> Subject: Re: [EXIM] accept() errors instead of accepting when network unreachable
>
> Greg A. Woods writes:
> > * Interactive UNIX 2.2 has a bug in accept(). If accept() is
> > * interrupted by an alarm signal, accept() does not return from
> > * waiting for a connection with errno set to EINTR.
>
> That's the standard BSD 4.2 signal behavior.


What's standard 4.2bsd? The "bug" in accept() (i.e. that it doesn't
return with errno==EINTR on SIGALRM)?

On ISC Unix, if I remember correctly, accept() returns, but with errno
set to what is effectively a garbage value.

On SunOS-4, SunOS-5, and NetBSD (and seemingly on SCO Unix, Linux, and
many other systems) accept() does seem to return with errno==EINTR when
the process receives a SIGALRM (though I've not verified this directly,
only through inference that smail works as expected).

> Restarting became optional in BSD 4.3 with siginterrupt() and in later
> systems with SA_RESTART in sigaction(). Try it!


I don't want an automatic restart -- the code's ready to handle EINTR,
and if the same daemon is to handle the queue then it needs the EINTR if
it's not getting connections often enough to trip the loop.

(On 4.3BSD Smail is #ifdef'ed to optionally call
siginterrupt(SIGALRM,1); though I don't know of any standard
configuration where this was done.)
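
A rough sketch of that siginterrupt() approach, assuming a system that
still provides the call (POSIX.1-2008 marks it obsolescent in favour of
sigaction() without SA_RESTART). It matters on systems where signal()
installs restarting handlers. The helper names are hypothetical, and
read() on an empty pipe stands in for accept() only so the sketch needs
no network; the same EINTR rule applies to any slow system call.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

static void on_alarm(int signo)
{
    (void)signo;                /* nothing to do: we only want EINTR */
}

/* Install the handler, then ask that SIGALRM interrupt system calls. */
static int enable_interrupting_alarm(void)
{
    if (signal(SIGALRM, on_alarm) == SIG_ERR)
        return -1;
    return siginterrupt(SIGALRM, 1);    /* 1 = interrupt, do not restart */
}

/*
 * Block in read() for at most `secs` seconds.
 * Returns 1 if the call was cut short by the alarm (errno == EINTR),
 * 0 otherwise.
 */
static int read_was_interrupted(int fd, unsigned secs)
{
    char c;
    ssize_t n;
    int saved;

    alarm(secs);
    n = read(fd, &c, 1);
    saved = errno;
    alarm(0);
    errno = saved;
    return n == -1 && saved == EINTR;
}
```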

> > This made me suspect that the
> > code which should have set errno to EINTR and returned from accept() was
> > doing other evil things instead -- evil things which eventually resulted
> > in enough corruption that the TCP stack was useless.
>
> Other than this wild speculation, is there any substance to your claim
> of ``lots of STREAMS based TCP/IP stacks where completely killing and
> restarting the daemon, or even rebooting sometimes, is still required
> when accept(2) gets itself tied in a knot''?


The fact that the system needed a reboot is *not* "wild speculation" --
it's a proven fact (witnesses and references available on request).

You've already corroborated the need to "kill and restart the daemon",
which in this case has the same effect as closing and re-opening the
socket -- the fix you recommended for SunOS-5.something.
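
As a sketch only, closing and re-opening the listening socket from
inside the daemon might look like the following. The helpers
open_listener and reopen_listener are hypothetical, the loopback
address and backlog of 5 are assumptions made for the example, and
nothing here claims to cure a stack that is already wedged badly enough
to need a reboot.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a TCP listener on the loopback address.
 * A port of 0 lets the kernel pick a free one.
 * Returns the descriptor, or -1 on failure.
 */
static int open_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* Allow a quick rebind of the port we have just released. */
    (void)setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Throw away a suspect listening socket and open a fresh one. */
static int reopen_listener(int old_fd, unsigned short port)
{
    if (old_fd >= 0)
        close(old_fd);
    return open_listener(port);
}
```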

I can't think of any systems other than ISC Unix where the reboot was
necessary for only this reason (SunOS-5.3 needed reboots too, but I'm
not sure if the reasons were the same -- luckily I never had to use that
piece of junk very much).

-- 
                            Greg A. Woods


+1 416 218-0098      VE3TCP      <gwoods@???>      <robohack!woods>
Planix, Inc. <woods@???>; Secrets of the Weird <woods@???>


--
*** Exim information can be found at http://www.exim.org/ ***