Author: Marc Haber
Date:  
To: exim-users
Subject: Re: [exim] exim 4.63 SMTP listener keeping log file open for days
On Fri, 2 May 2008 16:06:40 +0100, Tony Finch <dot@???> wrote:
>On Fri, 2 May 2008, Marc Haber wrote:
>> On Fri, 2 May 2008 12:26:35 +0100, Tony Finch <dot@???> wrote:
>> >
>> >Is that pid actually the parent listening process?
>>
>> How can I make the user find out?
>
>Check its ppid or run exiwhat (though if it's wedged the latter won't work).


Do I still have to do that now that you can see the issue yourself?
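
In case it is still useful: on Linux the ppid check can also be done
without ps by reading /proc/<pid>/stat. A minimal sketch, assuming /proc
is mounted; the function names parse_ppid_from_stat and read_ppid are
made up for illustration. For a daemonized listener I would expect the
ppid to be 1, while connection-handling children should report the
listener's pid.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Extract the parent pid from one line in /proc/<pid>/stat format.
 * Layout: "pid (comm) state ppid ...". Returns -1 if it cannot be parsed. */
static long parse_ppid_from_stat(const char *stat_line)
{
    /* comm may itself contain spaces or ')', so search from the right. */
    const char *close_paren = strrchr(stat_line, ')');
    char state;
    long ppid;

    if (close_paren == NULL)
        return -1;
    if (sscanf(close_paren + 1, " %c %ld", &state, &ppid) != 2)
        return -1;
    return ppid;
}

/* Read /proc/<pid>/stat (Linux only, an assumption) and return the ppid,
 * or -1 if the file cannot be read or parsed. */
static long read_ppid(pid_t pid)
{
    char path[64];
    char line[1024];
    FILE *fp;

    assert(pid > 0);
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof line, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return parse_ppid_from_stat(line);
}
```

Called as read_ppid(4349) for the pid from the lsof line, this would
tell whether that process hangs directly off init or off another exim.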

>> Is there any possibility that there is more than one exim process with
>> -bd -q30m around?
>
>Yes, all the connection-handling processes have the same proctitle as the
>parent listener.


I see...

>> Inside the [snippage], I snipped a line saying
>> |exim4   4349 Debian-exim    3u  IPv4 10856266            TCP *:smtp (LISTEN)

>
>Yes I saw that in the bug report. It seems that Exim closes the listening
>sockets after forking so that might be the parent process - the lack of
>connected sockets suggests this, though that doesn't rule it out from
>being a queue runner.


In my experience, a queue runner shows up as "exim -q", even if
invoked from the main daemon.

>Well it looks like I have a live one, and it looks rather broken - some
>kind of deadlock, possibly signal related? It's a bit mysterious.
>
>COMMAND PID USER   FD   TYPE     DEVICE      SIZE       NODE NAME
>exim    363 exim  cwd    DIR        8,4       144         20 /spool/exim.in
>exim    363 exim  rtd    DIR        8,1      4096          2 /
>exim    363 exim  txt    REG        8,1    741641     384409 /opt/exim-4.67+ppsw+0/bin/exim
>exim    363 exim  mem    REG        0,0                    0 [heap] (stat: No such file or directory)
>exim    363 exim  mem    REG        8,1     42109      15972 /lib/libnss_files-2.4.so
>exim    363 exim  mem    REG        8,1     41986      15976 /lib/libnss_nis-2.4.so
>exim    363 exim  mem    REG        8,1     21283      15970 /lib/libnss_dns-2.4.so
>exim    363 exim  mem    REG        8,1      5260     383034 /opt/exim-4.67+ppsw+0/etc/etc.ppsw/db/server_params.cdb
>exim    363 exim  mem    REG        8,1     85304     356218 /opt/zlib-1.2.3+0/lib/libz.so.1.2.3
>exim    363 exim  mem    REG        8,1     13814      15961 /lib/libdl-2.4.so
>exim    363 exim  mem    REG        8,1   1412466      15955 /lib/libc-2.4.so
>exim    363 exim  mem    REG        8,1   1496021     370574 /opt/OpenSSL-0.9.8e+0/lib/libcrypto.so.0.9.8
>exim    363 exim  mem    REG        8,1    291598     370570 /opt/OpenSSL-0.9.8e+0/lib/libssl.so.0.9.8
>exim    363 exim  mem    REG        8,1    104687      15981 /lib/libpthread-2.4.so
>exim    363 exim  mem    REG        8,1    838527     369291 /opt/db4-4.2.52+0/lib/libdb-4.2.so
>exim    363 exim  mem    REG        8,1    180631      15963 /lib/libm-2.4.so
>exim    363 exim  mem    REG        8,1     47259      15959 /lib/libcrypt-2.4.so
>exim    363 exim  mem    REG        8,1     87850      15966 /lib/libnsl-2.4.so
>exim    363 exim  mem    REG        8,1     74278      15983 /lib/libresolv-2.4.so
>exim    363 exim  mem    REG        8,1     31943      15968 /lib/libnss_compat-2.4.so
>exim    363 exim  mem    REG        8,1    128633      15948 /lib/ld-2.4.so
>exim    363 exim    0u   CHR        1,3                 3661 /dev/null
>exim    363 exim    1u   CHR        1,3                 3661 /dev/null
>exim    363 exim    2u   CHR        1,3                 3661 /dev/null
>exim    363 exim    3r   REG        8,1      5260     383034 /opt/exim-4.67+ppsw+0/etc/etc.ppsw/db/server_params.cdb
>exim    363 exim    6u  sock        0,5           1978191144 can't identify protocol
>exim    363 exim    7u  sock        0,5           1978191144 can't identify protocol
>exim    363 exim    8u  unix 0xf7c5e3c0           1977360074 socket
>exim    363 exim    9w   REG        8,4 365546126      19179 /spool/exim/log/mainlog.02 (deleted)


That doesn't look like a Debian system, so can I assume that you see
the issue as well and don't need any more debugging help from our
side? Or is there anything more I can do to help?
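
For reference, the "(deleted)" entry on fd 9 is the situation from the
subject line: the log file is gone from the directory, but the process
still holds the descriptor and keeps writing to it. A minimal sketch of
how a process could notice that state itself, using the link count from
fstat(). This is only an illustration, not how Exim handles its logs,
and the function names are hypothetical. Note that it only catches a
deleted file; a log that was merely renamed still has a link count of 1.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if the file behind fd no longer has any directory entry,
 * 0 if it still has one, -1 if fstat() fails. */
static int fd_is_unlinked(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return st.st_nlink == 0;
}

/* If fd points at a deleted file, open path afresh and return the new
 * descriptor; otherwise return fd unchanged. */
static int reopen_if_unlinked(int fd, const char *path)
{
    int newfd;

    assert(path != NULL);
    if (fd_is_unlinked(fd) != 1)
        return fd;

    newfd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0640);
    if (newfd < 0)
        return fd;      /* keep the old descriptor rather than lose lines */
    close(fd);
    return newfd;
}
```

A wedged process like pid 363 would of course never get as far as such
a check, which fits the log staying open for days.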

>(gdb) bt
>#0 0xffffe410 in __kernel_vsyscall ()
>#1 0xb7bbe57e in __lll_mutex_lock_wait () from /lib/libc.so.6
>#2 0xb7b7292d in _L_mutex_lock_1783 () from /lib/libc.so.6
>#3 0xb7ad1000 in ?? ()
>#4 0xb7ba3a63 in __read_nocancel () from /lib/libc.so.6
>#5 0xb7b53e38 in _IO_file_read_internal () from /lib/libc.so.6
>#6 0xb7b70f7f in localtime () from /lib/libc.so.6
>#7 0x080a3a59 in tod_stamp ()
>#8 0x08079c16 in log_write ()
>#9 0x08060085 in usr1_handler ()
>#10 <signal handler called>
>#11 0xffffe410 in __kernel_vsyscall ()
>#12 0xb7ba2403 in __xstat64@GLIBC_2.1 () from /lib/libc.so.6
>#13 0xb7b72e60 in __tzfile_read () from /lib/libc.so.6
>#14 0xb7b71e02 in tzset_internal () from /lib/libc.so.6
>#15 0xb7b7288e in tzset () from /lib/libc.so.6
>#16 0xb7b76a86 in strftime_l () from /lib/libc.so.6
>#17 0xb7b76926 in strftime () from /lib/libc.so.6
>#18 0x00000020 in ?? ()
>#19 0x080dfbbb in ?? ()
>#20 0xbfd7b67c in ?? ()
>#21 0xb7c123a0 in free () from /lib/libc.so.6
>#22 0x00000068 in ?? ()
>#23 0xbfd7b6b8 in ?? ()
>#24 0x080a3cd7 in tod_stamp ()
>Backtrace stopped: previous frame inner to this frame (corrupt stack?)
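
My reading of the backtrace, which may well be wrong given the "corrupt
stack?" warning: usr1_handler() (frame 9) calls log_write() and
tod_stamp(), which end up in localtime() (frame 6), while the
interrupted code was already inside tzset() via strftime() (frames 15
to 17). glibc serializes its timezone code with a lock, so the handler
would wait forever for a lock held by the very code it interrupted. The
usual way to avoid this class of problem is to do nothing in the handler
except set a flag, and to do the logging later from the main loop. A
minimal sketch of that pattern, not Exim's actual code; all names below
are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t usr1_pending = 0;

/* The handler only records that the signal arrived. It calls no
 * library function, so it cannot block on a libc-internal lock. */
static void usr1_flag_handler(int sig)
{
    (void)sig;
    usr1_pending = 1;
}

/* Install the handler for SIGUSR1; returns 0 on success. */
static int install_usr1_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = usr1_flag_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Called from the main loop, outside signal context. If a SIGUSR1 is
 * pending, writes a timestamped line into buf and returns 1; otherwise
 * returns 0 and leaves buf untouched. */
static int service_usr1(char *buf, size_t len)
{
    time_t now;
    struct tm tmbuf;
    char stamp[32];

    assert(buf != NULL && len > 0);
    if (!usr1_pending)
        return 0;
    usr1_pending = 0;

    /* Safe here: we are not inside a signal handler. */
    now = time(NULL);
    if (localtime_r(&now, &tmbuf) == NULL ||
        strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", &tmbuf) == 0)
        snprintf(stamp, sizeof stamp, "unknown-time");
    snprintf(buf, len, "%s SIGUSR1 received", stamp);
    return 1;
}
```

The main loop would call service_usr1() once per iteration and pass the
resulting line to whatever does the real logging.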


Does it help to have my user run exim with debugging symbols (which is
easily possible on Debian)?

Is it sufficient to run gdb <pid> and to type "bt"?

Greetings
Marc

-- 
-------------------------------------- !! No courtesy copies, please !! -----
Marc Haber         |   " Questions are the         | Mailadresse im Header
Mannheim, Germany  |     Beginning of Wisdom "     | http://www.zugschlus.de/
Nordisch by Nature | Lt. Worf, TNG "Rightful Heir" | Fon: *49 621 72739834