On Fri, 2 May 2008 16:06:40 +0100, Tony Finch <dot@???> wrote:
>On Fri, 2 May 2008, Marc Haber wrote:
>> On Fri, 2 May 2008 12:26:35 +0100, Tony Finch <dot@???> wrote:
>> >
>> >Is that pid actually the parent listening process?
>>
>> How can I make the user find out?
>
>Check its ppid or run exiwhat (though if it's wedged the latter won't work).

Do I still have to do that now that you are seeing the issue yourself?

>> Is there any possibility that there is more than one exim process with
>> -bd -q30m around?
>
>Yes, all the connection-handling processes have the same proctitle as the
>parent listener.

I see...

>> Inside the [snippage], I snipped a line saying
>> |exim4 4349 Debian-exim 3u IPv4 10856266 TCP *:smtp (LISTEN)
>
>Yes I saw that in the bug report. It seems that Exim closes the listening
>sockets after forking so that might be the parent process - the lack of
>connected sockets suggests this, though that doesn't rule it out from
>being a queue runner.

In my experience, a queue runner shows up as "exim -q", even if
invoked from the main daemon.

>Well it looks like I have a live one, and it looks rather broken - some
>kind of deadlock, possibly signal related? It's a bit mysterious.
>
>COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
>exim 363 exim cwd DIR 8,4 144 20 /spool/exim.in
>exim 363 exim rtd DIR 8,1 4096 2 /
>exim 363 exim txt REG 8,1 741641 384409 /opt/exim-4.67+ppsw+0/bin/exim
>exim 363 exim mem REG 0,0 0 [heap] (stat: No such file or directory)
>exim 363 exim mem REG 8,1 42109 15972 /lib/libnss_files-2.4.so
>exim 363 exim mem REG 8,1 41986 15976 /lib/libnss_nis-2.4.so
>exim 363 exim mem REG 8,1 21283 15970 /lib/libnss_dns-2.4.so
>exim 363 exim mem REG 8,1 5260 383034 /opt/exim-4.67+ppsw+0/etc/etc.ppsw/db/server_params.cdb
>exim 363 exim mem REG 8,1 85304 356218 /opt/zlib-1.2.3+0/lib/libz.so.1.2.3
>exim 363 exim mem REG 8,1 13814 15961 /lib/libdl-2.4.so
>exim 363 exim mem REG 8,1 1412466 15955 /lib/libc-2.4.so
>exim 363 exim mem REG 8,1 1496021 370574 /opt/OpenSSL-0.9.8e+0/lib/libcrypto.so.0.9.8
>exim 363 exim mem REG 8,1 291598 370570 /opt/OpenSSL-0.9.8e+0/lib/libssl.so.0.9.8
>exim 363 exim mem REG 8,1 104687 15981 /lib/libpthread-2.4.so
>exim 363 exim mem REG 8,1 838527 369291 /opt/db4-4.2.52+0/lib/libdb-4.2.so
>exim 363 exim mem REG 8,1 180631 15963 /lib/libm-2.4.so
>exim 363 exim mem REG 8,1 47259 15959 /lib/libcrypt-2.4.so
>exim 363 exim mem REG 8,1 87850 15966 /lib/libnsl-2.4.so
>exim 363 exim mem REG 8,1 74278 15983 /lib/libresolv-2.4.so
>exim 363 exim mem REG 8,1 31943 15968 /lib/libnss_compat-2.4.so
>exim 363 exim mem REG 8,1 128633 15948 /lib/ld-2.4.so
>exim 363 exim 0u CHR 1,3 3661 /dev/null
>exim 363 exim 1u CHR 1,3 3661 /dev/null
>exim 363 exim 2u CHR 1,3 3661 /dev/null
>exim 363 exim 3r REG 8,1 5260 383034 /opt/exim-4.67+ppsw+0/etc/etc.ppsw/db/server_params.cdb
>exim 363 exim 6u sock 0,5 1978191144 can't identify protocol
>exim 363 exim 7u sock 0,5 1978191144 can't identify protocol
>exim 363 exim 8u unix 0xf7c5e3c0 1977360074 socket
>exim 363 exim 9w REG 8,4 365546126 19179 /spool/exim/log/mainlog.02 (deleted)

That doesn't look like a Debian system, so can I assume that you see
the issue as well and don't need any more debugging help from our
side? Or is there anything more I can do to help?

>(gdb) bt
>#0 0xffffe410 in __kernel_vsyscall ()
>#1 0xb7bbe57e in __lll_mutex_lock_wait () from /lib/libc.so.6
>#2 0xb7b7292d in _L_mutex_lock_1783 () from /lib/libc.so.6
>#3 0xb7ad1000 in ?? ()
>#4 0xb7ba3a63 in __read_nocancel () from /lib/libc.so.6
>#5 0xb7b53e38 in _IO_file_read_internal () from /lib/libc.so.6
>#6 0xb7b70f7f in localtime () from /lib/libc.so.6
>#7 0x080a3a59 in tod_stamp ()
>#8 0x08079c16 in log_write ()
>#9 0x08060085 in usr1_handler ()
>#10 <signal handler called>
>#11 0xffffe410 in __kernel_vsyscall ()
>#12 0xb7ba2403 in __xstat64@GLIBC_2.1 () from /lib/libc.so.6
>#13 0xb7b72e60 in __tzfile_read () from /lib/libc.so.6
>#14 0xb7b71e02 in tzset_internal () from /lib/libc.so.6
>#15 0xb7b7288e in tzset () from /lib/libc.so.6
>#16 0xb7b76a86 in strftime_l () from /lib/libc.so.6
>#17 0xb7b76926 in strftime () from /lib/libc.so.6
>#18 0x00000020 in ?? ()
>#19 0x080dfbbb in ?? ()
>#20 0xbfd7b67c in ?? ()
>#21 0xb7c123a0 in free () from /lib/libc.so.6
>#22 0x00000068 in ?? ()
>#23 0xbfd7b6b8 in ?? ()
>#24 0x080a3cd7 in tod_stamp ()
>Backtrace stopped: previous frame inner to this frame (corrupt stack?)

Does it help to have my user run exim with debugging symbols (which is
easily possible on Debian)?

Is it sufficient to run gdb <pid> and to type "bt"?
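
For reference, the quoted backtrace reads as if usr1_handler() called
log_write() and then localtime() while the interrupted code was already
inside strftime()/tzset(), so the handler would be waiting on a libc lock
that its own process holds. localtime() is not on the POSIX list of
async-signal-safe functions, so this kind of self-deadlock is possible in
principle. The usual way around it is to have the handler only set a flag
and to do the logging later from the main loop. Below is a minimal sketch
of that pattern. It is not Exim's code: install_usr1_handler() and
poll_usr1() are hypothetical names made up for illustration, and the
handler name is only borrowed from the backtrace.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler, consumed by the main loop. volatile sig_atomic_t
   is the type the C standard allows a signal handler to write safely. */
static volatile sig_atomic_t usr1_pending = 0;

/* The handler does nothing but record that the signal arrived:
   no logging, no localtime(), no stdio, no malloc(). */
static void usr1_handler(int sig)
{
    (void)sig;
    usr1_pending = 1;
}

/* Hypothetical helper: install the handler.
   Returns 0 on success, -1 on failure (as sigaction() does). */
int install_usr1_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = usr1_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Hypothetical helper: called from the main loop, outside signal context.
   Returns 1 if SIGUSR1 arrived since the last call, else 0. Several
   signals arriving between two calls are reported as a single event. */
int poll_usr1(void)
{
    if (!usr1_pending)
        return 0;
    usr1_pending = 0;
    return 1;
}
```

The main loop would call poll_usr1() on each iteration and write its
status line only when that returns 1, so that any timestamp formatting
happens in normal context rather than inside the handler.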

Greetings
Marc

--
-------------------------------------- !! No courtesy copies, please !! -----
Marc Haber | " Questions are the | Mailadresse im Header
Mannheim, Germany | Beginning of Wisdom " | http://www.zugschlus.de/
Nordisch by Nature | Lt. Worf, TNG "Rightful Heir" | Fon: *49 621 72739834