[EXIM] Odd trace of exim -bd -qq20s

Author: michael
Date:  
To: exim-users
Subject: [EXIM] Odd trace of exim -bd -qq20s
During a DoS attack in which someone sent many mails, always a bunch
per connection, I noticed some odd behaviour. I always queue all mail
and use 5 queue runners to deliver it every 20 seconds, which is more
efficient because they reuse connections.

Unfortunately, when a lot of mail is sent to me, the queue runners stop
delivering for a while and then restart. The disk subsystem is only
lightly loaded and there is idle CPU time, so I started wondering.
Tracing one of my several "exim -bd -qq20s" processes showed the
following:

rt_sigaction(SIGTERM, {0x804af6c, [], SA_RESTART}, {SIG_DFL}, 8) = 0
rt_sigaction(SIGINT, {0x804af6c, [], SA_RESTART}, {SIG_DFL}, 8) = 0
write(1, "354 Enter message, ending with \""..., 56) = 56
read(3, "Received: from login_0246.aol.co"..., 8192) = 470
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(0)                                = 300
getpid()                                = 6239
time(NULL)                              = 927726783
time(NULL)                              = 927726783
open("/var/exim/spool/input/D/10me7D-0001cd-01-D", O_RDWR|O_CREAT|O_EXCL, 0600) = 4
fchown(4, 47, 47)                       = 0
fchmod(4, 0600)                         = 0
fcntl(4, F_GETFL)                       = 0x2 (flags O_RDWR)
fstat(4, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4011a000
_llseek(4, 0, {0}, SEEK_CUR)            = 0
fcntl(4, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
alarm(300)                              = 0
alarm(300)                              = 300
read(3, "\r\n!!! ACHTUNG -- Information !"..., 8192) = 1034
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300
alarm(300)                              = 300


Calling alarm() this often does not look right to me. I doubt it
causes my problems, but it is odd in itself. I am using Exim 3.00
with Linux 2.2.5 and glibc 2.1.1pre, but I think this is an Exim
issue.
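
For illustration, here is a minimal sketch of one pattern that would
produce such a trace: a per-line timeout that is re-armed before every
line, even when the line comes out of a buffer that a single read()
already filled. This is only an assumption about where the calls could
come from, not taken from the Exim source; read_lines_with_timeout and
its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not Exim code: read `in` line by line and re-arm
 * a timeout before each line. Returns the number of lines read and
 * stores the number of arming alarm() calls in *arm_calls.
 * Lines longer than the buffer are counted once per chunk. */
size_t read_lines_with_timeout(FILE *in, unsigned timeout_secs,
                               size_t *arm_calls)
{
    char line[1024];
    size_t lines = 0;

    assert(in != NULL && arm_calls != NULL);
    *arm_calls = 0;

    for (;;) {
        /* One alarm() syscall per line, although stdio may satisfy
         * many lines from a single underlying read(). */
        alarm(timeout_secs);
        (*arm_calls)++;

        if (fgets(line, sizeof line, in) == NULL)
            break;              /* end of input or read error */
        lines++;
    }

    alarm(0);                   /* cancel the pending timeout */
    return lines;
}
```

With input of N short lines this makes N+1 arming calls followed by one
alarm(0), which matches the shape of the trace: one read(), a run of
alarm(300), then alarm(0).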

Tracing an "exim -qq" process, I see this:

open("/var/exim/spool/qr10071", O_RDONLY) = 1
read(1, "\213n\0\0", 4)                 = 4
getpid()                                = 10071
kill(28299, SIGHUP)                     = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, {1, 0})               = 0
kill(28299, SIGHUP)                     = -1 ESRCH (No such process)
read(1, "", 4)                          = 0
unlink("/var/exim/spool/qr10071")       = 0
close(1)                                = 0
getpid()                                = 10071
stat("/var/exim/spool/input/k/10me3k-0001ih-01-H", {st_mode=S_IFREG|0600, st_size=959, ...}) = 0
getpid()                                = 10071
fork()                                  = 28314
getpid()                                = 10071
wait4(-1, [WIFEXITED(s) && WEXITSTATUS(s) == 8], 0, NULL) = 28314
--- SIGCHLD (Child exited) ---


Why does exim call kill() on the process instead of waiting for it to
die? The one-second sleep probably reduces throughput considerably.
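
To make the two approaches concrete, here is a minimal sketch of both.
waitpid() (like wait4()) can only wait for children of the calling
process, so if pid 28299 is not a child of the queue runner (an
assumption, the trace does not show it), probing with kill() and
sleeping is one way to notice that it has gone. The function names are
hypothetical, and the sketch probes with signal 0 rather than the
SIGHUP seen in the trace, so the target is not disturbed.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: poll an arbitrary process until it no longer
 * exists. Returns the number of sleeps taken, or -1 if kill() failed
 * for a reason other than ESRCH (for example EPERM).
 * Note: kill() still succeeds on a zombie until its parent reaps it. */
int poll_until_gone(pid_t pid, unsigned interval_secs)
{
    int polls = 0;

    assert(pid > 0);            /* pid <= 0 would address process groups */
    while (kill(pid, 0) == 0) { /* signal 0: existence check only */
        polls++;
        sleep(interval_secs);   /* latency is up to one full interval */
    }
    return errno == ESRCH ? polls : -1;
}

/* Hypothetical helper: block until a child exits, without polling.
 * Works only for children of the caller. Returns the exit status,
 * or -1 if pid is not our child or did not exit normally. */
int wait_for_child(pid_t pid)
{
    int status;

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The polling version costs up to interval_secs of idle time per process,
which is the throughput concern raised above; the blocking version
returns as soon as the child exits but is not available for unrelated
processes.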

Michael

--
*** Exim information can be found at http://www.exim.org/ ***