Re: [exim] Re: Exim 4.42 sleeping "forever"

Author: Alex Kiernan
Date:  
To: exim-users
Subject: Re: [exim] Re: Exim 4.42 sleeping "forever"
On Sat, 02 Oct 2004 19:34:09 +0100, Andrew - Supernews
<andrew@???> wrote:
> >>>>> "Philip" == Philip Hazel <ph10@???> writes:
>
> > On Thu, 30 Sep 2004, Alex Kiernan wrote:
> >> We (think) we understand... setitimer is supposed to return EINVAL
> >> for non-canonical values
> >> (http://www.unix.org/onlinepubs/009695399/functions/setitimer.html),
> >> checking on Solaris 9 and FreeBSD 4.8, that is indeed the case
> >> (and it's *not* on Fedora Core 2), so we end up going to sleep
> >> forever, having ignored the return from setitimer and end up
> >> waiting for an event which will never happen.
>
> Philip> Thanks for the report and for the further information. I've
> Philip> noted it for investigation.
>
> The values in that debug output, if they can be believed at all,


I think they can - if you chase up the frames & do the maths by hand,
you end up with the numbers you see.

Only... scrolling back, I see I neglected to include all the detail :(

Here's a different one where I just printed the relevant things:

(gdb) where
#0  0x28367a48 in sigsuspend () from /usr/lib/libc.so.4
#1  0x805b961 in milliwait (itval=0xbfbff010) at exim.c:214
#2  0x805babf in exim_wait_tick (then_tv=0x80e4578, resolution=5000)
    at exim.c:322
#3  0x8082908 in receive_msg (extract_recip=0) at receive.c:2870
#4  0x804f70d in handle_smtp_call (listen_sockets=0x80f3080,
    listen_socket_count=1, accept_socket=1, accepted=0xbfbff3b0)
    at daemon.c:466
#5  0x8051250 in daemon_go () at daemon.c:1638
#6  0x8060c8c in main (argc=5, cargv=0xbfbffbac) at exim.c:3667
#7  0x804bf0e in _start ()


(gdb) up
#1  0x805b961 in milliwait (itval=0xbfbff010) at exim.c:214
214     in exim.c


(gdb) print *itval
$3 = {it_interval = {tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 147,
    tv_usec = -106474}}


(gdb) up
#2  0x805babf in exim_wait_tick (then_tv=0x80e4578, resolution=5000)
    at exim.c:322
322     in exim.c


(gdb) print *then_tv
$4 = {tv_sec = 1096389404, tv_usec = 785000}

(gdb) print now_tv
$5 = {tv_sec = 1096389257, tv_usec = 895000}

(gdb) print now_true_usec
$6 = 896474

(gdb) print itval
$7 = {it_interval = {tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 147,
    tv_usec = -106474}}


(gdb) up
#3  0x8082908 in receive_msg (extract_recip=0) at receive.c:2870
2870    receive.c: No such file or directory.


(gdb) print id_resolution
$8 = 5000

(gdb) print message_id_tv
$9 = {tv_sec = 1096389404, tv_usec = 785000}

(gdb)


> should, if I'm reading the code right, only be possible if the system
> clock was stepped substantially backwards during the process. The code
> in exim_wait_tick is approximately correct _only_ for the case where
> the time "then" is no more than "resolution" microseconds newer than
> the rounded-down value of "now", i.e. the required delay is less than
> "resolution" (which is the normal case when the clock is monotonic).
>
> I think there's a logical flaw involved in trying to handle the "clock
> went backwards" case in any event.
>


Ah...

The box got rebooted, ntpd started, exim started, and then ntpd stepped
the clock backwards by 178.829209s. Exim had already started receiving
messages by then, and the world got confused.

--
Alex Kiernan