Hello all!
I have configured a dynamic blacklist backed by PostgreSQL.
problem:
- some exim processes hang and use 100% CPU.
details:
- only very few processes are affected. [2]
- strace [3] shows far too many polls per second,
  presumably on the connection to the SQL server
- netstat [4] shows the connections:
  - SQL server: the connection does not exist
  - MTA: connection state is CLOSE_WAIT
    (for several hours, see [5])
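summary:
- exim polls on a half-closed connection: poll() reports the socket as
  readable, recv() returns 0 (the peer has closed), and the loop repeats.
suggestion:
- reduce the polling rate (usleep)
- add a timeout to the poll loop, e.g.:
if (pollcount > pollmax) { printlog("Error: sql-connect dead"); exit(1); }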
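
The timeout idea above could look roughly like the following C sketch. It is not taken from exim or from the PostgreSQL client library, and whether the loop in [3] lives in exim itself or in the client library is not known here. The function name wait_for_peer and the enum values are hypothetical names chosen for illustration. The sketch uses a finite poll() timeout instead of -1 and treats recv() returning 0 as "peer closed" rather than polling again. It peeks with MSG_PEEK so that any real data stays in the socket buffer for the caller.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not exim or libpq names. */
enum wait_result {
    WAIT_DATA = 0,    /* at least one byte is ready to be read */
    WAIT_CLOSED = 1,  /* peer closed the connection (recv returned 0) */
    WAIT_TIMEOUT = 2, /* nothing happened within timeout_ms */
    WAIT_ERROR = 3    /* poll() or recv() failed */
};

/*
 * Wait at most timeout_ms milliseconds for activity on fd.
 * A readable socket on which recv() returns 0 is reported as
 * WAIT_CLOSED, so the caller can give up instead of looping.
 */
enum wait_result wait_for_peer(int fd, int timeout_ms)
{
    struct pollfd pfd;
    char byte;
    ssize_t n;
    int rc;

    assert(fd >= 0);

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts poll(); this restarts the timeout. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return WAIT_ERROR;
    if (rc == 0)
        return WAIT_TIMEOUT;
    if (pfd.revents & (POLLERR | POLLNVAL))
        return WAIT_ERROR;

    /* Peek at one byte so real data is left for the caller to read. */
    do {
        n = recv(fd, &byte, 1, MSG_PEEK);
    } while (n < 0 && errno == EINTR);

    if (n == 0)
        return WAIT_CLOSED;  /* end of stream: the case seen in [3] */
    if (n < 0)
        return WAIT_ERROR;
    return WAIT_DATA;
}
```

A caller would treat WAIT_CLOSED or WAIT_TIMEOUT as a dead SQL connection, log it, and abandon the lookup instead of calling poll() again.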
Regards, Heiko
------------------------------------------------------------------
- [1] sysinfo
- exim 4.50
- debian
- [2] top
top - 10:52:49 up 43 days, 21:20, 2 users, load average: 3.75, 2.53, 1.26
Tasks: 359 total, 6 running, 353 sleeping, 0 stopped, 0 zombie
Cpu(s): 54.8% us, 43.2% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 2.0% si
Mem: 393356k total, 389892k used, 3464k free, 5008k buffers
Swap: 65528k total, 3240k used, 62288k free, 112164k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14875 Debian-e 25 0 9476 2244 1740 R 32.9 0.6 2:47.91 exim4
14273 Debian-e 25 0 9476 2248 1744 R 32.2 0.6 2:48.49 exim4
26421 root 25 0 29860 25m 2468 R 17.9 6.6 5:19.82 spamd
28736 root 25 0 30860 26m 2448 R 15.9 6.9 0:55.85 spamd
18269 root 16 0 2264 1260 848 R 0.3 0.3 0:00.02 top
18274 Debian-e 15 0 9460 912 472 S 0.3 0.2 0:00.01 exim4
- [3] strace
smtp # strace -p14273
poll([{fd=10, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
recv(10, "", 1, 0) = 0
poll([{fd=10, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
recv(10, "", 1, 0) = 0
poll([{fd=10, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
recv(10, "", 1, 0) = 0
poll([{fd=10, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
recv(10, "", 1, 0) = 0
poll([{fd=10, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
recv(10, "", 1, 0) = 0
poll([{fd=10, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
recv(10, "", 1, 0) = 0
poll([{fd=10, events=POLLIN|POLLERR, revents=POLLIN}], 1, -1) = 1
recv(10, "", 1, 0) = 0
- approx. 100/s
- [4] netstat
How can I map an exim PID to its exim message?
smtp # netstat -anep | grep 14273
tcp 1 0 x.x.x.56:25 81.213.219.21:34507 CLOSE_WAIT 103 52561311 14273/exim4
tcp 0 0 x.x.x.56:37907 x.x.x.73:5432 CLOSE_WAIT 103 52563694 14273/exim4
sql # netstat -anep | grep 37907
sql #
- on the MTA the connection is in state CLOSE_WAIT;
on the SQL server the connection does not exist
- [5] exim_mainlog
smtp # grep 81.213.219.21 /var/log/exim4/mainlog | tail | sed "s/foo/mydomain/"
2007-10-01 08:48:19 H=(dsl.static8121321921.ttnet.net.tr) [81.213.219.21] F=<harkaitz@???> rejected RCPT <achanta@???>: User account unknown
2007-10-01 08:48:49 H=(dsl.static8121321921.ttnet.net.tr) [81.213.219.21] F=<harkaitz@???> rejected RCPT <achenes@???>: User account unknown
2007-10-01 08:49:20 H=(dsl.static8121321921.ttnet.net.tr) [81.213.219.21] F=<harkaitz@???> rejected RCPT <bolling@???>: User account unknown
2007-10-01 12:00:14 SMTP connection from (dsl.static8121321921.ttnet.net.tr) [81.213.219.21] closed after SIGTERM