Re: [exim] Exim processes hanging in 'futex' system call

Author: Matthias Foerste
Date:
To: exim-users
Subject: Re: [exim] Exim processes hanging in 'futex' system call
On Thu, Jul 08, 2010 at 01:57:08PM -0700, Phil Pennock wrote:
> On 2010-07-08 at 11:41 +0200, Matthias Foerste wrote:
> > It looks like __tz_convert() wants to lock something - succeeding the
> > first time, but then waiting forever in the signal handler for the
> > lock to be released. Someone ran 'watch -n 10 exiwhat' inside a screen
> > session and apparently exiwhat caught the exim listener while it was in
> > tod_stamp() about once per week.
>
> Ugh.
>
> Does the problem go away if you set:
> timezone = UTC
> in the main Exim configuration?
>
> The problem appears to be that tod_stamp uses localtime() instead of
> localtime_r() but SIGUSR1 calls into this. The same problem applies to
> gmtime() vs gmtime_r() but I suspect that gmtime (used if you set
> timezone = UTC) won't be locking tz data-structures.


Today I set up a 'test environment'. It looks like gmtime() still wants
to lock something:

(gdb) bt
#0  0xb77e2416 in __kernel_vsyscall ()
#1  0xb74e7cc3 in __lll_lock_wait_private () from /lib/i686/cmov/libc.so.6
#2  0xb748549b in _L_lock_1702 () from /lib/i686/cmov/libc.so.6
#3  0xb7485291 in __tz_convert () from /lib/i686/cmov/libc.so.6
#4  0xb74836df in gmtime () from /lib/i686/cmov/libc.so.6
#5  0x08055a2d in debug_vprintf (format=0x80e5bad "%s", ap=0xbffbba64 "") at debug.c:174
#6  0x08055aa8 in debug_printf (format=0x80e5bad "%s") at debug.c:155
#7  0x0807d478 in log_write (selector=0, flags=<value optimized out>, format=0x80e5bad "%s") at log.c:712
#8  0x08062caa in usr1_handler (sig=10) at exim.c:158
#9  <signal handler called>
#10 0xb748313d in __offtime () from /lib/i686/cmov/libc.so.6
#11 0xb7485329 in __tz_convert () from /lib/i686/cmov/libc.so.6
#12 0xb74836df in gmtime () from /lib/i686/cmov/libc.so.6
#13 0x08055a2d in debug_vprintf (format=0x80e63c2 "Considering %s\n", ap=0xbffbbee4 " \004\020\b \004\020\bp��(��\234�~�\230�z�C�\004\b") at debug.c:174
#14 0x08055aa8 in debug_printf (format=0x80e63c2 "Considering %s\n") at debug.c:155
#15 0x080b3eff in verify_address (vaddr=0x8100358, f=0x0, options=<value optimized out>, callout=-1, callout_overall=-1, callout_connect=-1, se_mailfrom=0x0,
    pm_mailfrom=0x0, routed=0xbffbca3c) at verify.c:1038
#16 0x0804dcaf in acl_verify (where=<value optimized out>, addr=0xbffbcc48, arg=0x80fd358 "sender", user_msgptr=0xbffbcea8, log_msgptr=0xbffbceb0,
    basic_errno=0xbffbcc08) at acl.c:1935
#17 0x0804fd92 in acl_check_condition (verb=5, cb=0x80fd348, where=<value optimized out>, addr=0xbffbcc48, level=0, epp=0xbffbcc04, user_msgptr=0xbffbcea8,
    log_msgptr=0xbffbceb0, basic_errno=0xbffbcc08) at acl.c:3111
#18 0x0804ebd4 in acl_check_internal (where=0, addr=0xbffbcc48, s=<value optimized out>, level=0, user_msgptr=0xbffbcea8, log_msgptr=0xbffbceb0) at acl.c:3494
#19 0x0804f51d in acl_check (where=0, recipient=0x8100170 "luser@???", s=0x80fd120 "acl_check_rcpt", user_msgptr=0xbffbcea8,
    log_msgptr=0xbffbceb0) at acl.c:3668
#20 0x080a3079 in smtp_setup_msg () at smtp_in.c:3602
#21 0x08053ab7 in daemon_go () at daemon.c:506
#22 0x08068385 in main (argc=4, cargv=0xbfffd5b4) at exim.c:4136
(gdb)


Yes, I used '-d+all' to trigger the behaviour rather quickly.

>
> So yes, looks like there's a bug in the USR1 support used to implement
> the exiwhat(1) logic. I've filed bug 1007:
> http://bugs.exim.org/show_bug.cgi?id=1007


Thank you.

>
> -Phil


--
Matthias Förste