[exim-dev] strange behaviour of server

Page principale
Supprimer ce message
Répondre à ce message
Auteur: Илья Антипов
Date:  
À: exim-dev
Sujet: [exim-dev] strange behaviour of server
Hello everybody.

This night server stopped responding for some time. About 3 hours. During
this period of time hdd became completely full (there was about 1Tb oа free
space) and 90 of 230 (aprox) processes became uninterruptiblle . And
suddenly space was freed and everything became stable again. Here what was
in dmesg.

[9417724.085399] INFO: task exim4:23580 blocked for more than 120 seconds.
[9417724.085403] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9417724.085406] exim4           D 0000000000000004     0 23580   6911
0x00000000
[9417724.085412]  ffff88000d9a1de8 0000000000000086 ffff8807ea54c500
0000000000000041
[9417724.085418]  ffff88000d9a1fd8 ffff88000d9a1fd8 ffff88000d9a1fd8
00000000000137c0
[9417724.085422]  ffff8807e7c22e00 ffff8807ea54c500 0000000000000000
ffff8806acde0f10
[9417724.085427] Call Trace:
[9417724.085437]  [<ffffffff816590ff>] schedule+0x3f/0x60
[9417724.085442]  [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9417724.085447]  [<ffffffff81187361>] ? do_path_lookup+0x31/0xc0
[9417724.085450]  [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9417724.085454]  [<ffffffff8118774e>] do_unlinkat+0x8e/0x1d0
[9417724.085459]  [<ffffffff81178290>] ? vfs_write+0x110/0x180
[9417724.085462]  [<ffffffff8117855a>] ? sys_write+0x4a/0x90
[9417724.085466]  [<ffffffff811883e6>] sys_unlink+0x16/0x20
[9417724.085470]  [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9417724.085473] INFO: task exim4:23608 blocked for more than 120 seconds.
[9417724.085475] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9417724.085478] exim4           D ffffffff81806200     0 23608   6911
0x00000000
[9417724.085482]  ffff88006d161c78 0000000000000086 ffff8807e49f37b0
ffff88006d161e38
[9417724.085486]  ffff88006d161fd8 ffff88006d161fd8 ffff88006d161fd8
00000000000137c0
[9417724.085491]  ffffffff81c0d020 ffff880220d6ae00 0000000000000081
ffff8806acde0f10
[9417724.085495] Call Trace:
[9417724.085499]  [<ffffffff816590ff>] schedule+0x3f/0x60
[9417724.085502]  [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9417724.085506]  [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9417724.085509]  [<ffffffff811862b4>] do_last+0x2b4/0x730
[9417724.085513]  [<ffffffff81187ad1>] path_openat+0xd1/0x3f0
[9417724.085516]  [<ffffffff81183415>] ? putname+0x35/0x50
[9417724.085520]  [<ffffffff81187f12>] do_filp_open+0x42/0xa0
[9417724.085526]  [<ffffffff81319811>] ? strncpy_from_user+0x31/0x40
[9417724.085529]  [<ffffffff8118325a>] ? do_getname+0x10a/0x180
[9417724.085533]  [<ffffffff8165b00e>] ? _raw_spin_lock+0xe/0x20
[9417724.085538]  [<ffffffff81195227>] ? alloc_fd+0xf7/0x150
[9417724.085541]  [<ffffffff81177538>] do_sys_open+0xf8/0x240
[9417724.085544]  [<ffffffff811776a0>] sys_open+0x20/0x30
[9417724.085548]  [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9417724.085551] INFO: task exim4:23710 blocked for more than 120 seconds.
[9417724.085553] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9417724.085555] exim4           D ffffffff81806200     0 23710   6911
0x00000000
[9417724.085559]  ffff880005f87c78 0000000000000086 ffff8807e49f37b0
ffff880005f87e38
[9417724.085563]  ffff880005f87fd8 ffff880005f87fd8 ffff880005f87fd8
00000000000137c0
[9417724.085567]  ffff8807ea099700 ffff88000d419700 0000000000000081
ffff8806acde0f10
[9417724.085572] Call Trace:
[9417724.085576]  [<ffffffff816590ff>] schedule+0x3f/0x60
[9417724.085579]  [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9417724.085582]  [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9417724.085586]  [<ffffffff811862b4>] do_last+0x2b4/0x730
[9417724.085589]  [<ffffffff81187ad1>] path_openat+0xd1/0x3f0
[9417724.085593]  [<ffffffff81183415>] ? putname+0x35/0x50
[9417724.085596]  [<ffffffff81187f12>] do_filp_open+0x42/0xa0
[9417724.085601]  [<ffffffff81319811>] ? strncpy_from_user+0x31/0x40
[9417724.085604]  [<ffffffff8118325a>] ? do_getname+0x10a/0x180
[9417724.085607]  [<ffffffff8165b00e>] ? _raw_spin_lock+0xe/0x20
[9417724.085611]  [<ffffffff81195227>] ? alloc_fd+0xf7/0x150
[9417724.085614]  [<ffffffff81177538>] do_sys_open+0xf8/0x240
[9417724.085617]  [<ffffffff811776a0>] sys_open+0x20/0x30
[9417724.085621]  [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9417724.085624] INFO: task exim4:23741 blocked for more than 120 seconds.
[9417724.085626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9417724.085628] exim4           D ffffffff81806200     0 23741   6911
0x00000000
[9417724.085631]  ffff8800b3df1c78 0000000000000082 ffff8807e49f37b0
ffff8800b3df1e38
[9417724.085636]  ffff8800b3df1fd8 ffff8800b3df1fd8 ffff8800b3df1fd8
00000000000137c0
[9417724.085640]  ffff8807ea5e9700 ffff8806d37c4500 0000000000000081
ffff8806acde0f10
[9417724.085644] Call Trace:
[9417724.085648]  [<ffffffff816590ff>] schedule+0x3f/0x60
[9417724.085651]  [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9417724.085655]  [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9417724.085658]  [<ffffffff811862b4>] do_last+0x2b4/0x730
[9417724.085662]  [<ffffffff81187ad1>] path_openat+0xd1/0x3f0
[9417724.085665]  [<ffffffff81183415>] ? putname+0x35/0x50
[9417724.085669]  [<ffffffff81187f12>] do_filp_open+0x42/0xa0
[9417724.085673]  [<ffffffff81319811>] ? strncpy_from_user+0x31/0x40
[9417724.085676]  [<ffffffff8118325a>] ? do_getname+0x10a/0x180
[9417724.085680]  [<ffffffff8165b00e>] ? _raw_spin_lock+0xe/0x20
[9417724.085683]  [<ffffffff81195227>] ? alloc_fd+0xf7/0x150
[9417724.085686]  [<ffffffff81177538>] do_sys_open+0xf8/0x240
[9417724.085690]  [<ffffffff811776a0>] sys_open+0x20/0x30
[9417724.085693]  [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9417724.085696] INFO: task exim4:23742 blocked for more than 120 seconds.
[9417724.085698] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9417724.085700] exim4           D ffffffff81806200     0 23742   6911
0x00000000
[9417724.085704]  ffff880010d0bc78 0000000000000086 ffff8807e49f37b0
ffff880010d0be38
[9417724.085708]  ffff880010d0bfd8 ffff880010d0bfd8 ffff880010d0bfd8
00000000000137c0
[9417724.085712]  ffff8807ea001700 ffff88000d41dc00 0000000000000081
ffff8806acde0f10
[9417724.085716] Call Trace:
[9417724.085720]  [<ffffffff816590ff>] schedule+0x3f/0x60
[9417724.085724]  [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9417724.085727]  [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9417724.085730]  [<ffffffff811862b4>] do_last+0x2b4/0x730
[9417724.085734]  [<ffffffff81187ad1>] path_openat+0xd1/0x3f0
[9417724.085738]  [<ffffffff81187f12>] do_filp_open+0x42/0xa0
[9417724.085742]  [<ffffffff81319811>] ? strncpy_from_user+0x31/0x40
[9417724.085745]  [<ffffffff8118325a>] ? do_getname+0x10a/0x180
[9417724.085749]  [<ffffffff8165b00e>] ? _raw_spin_lock+0xe/0x20
[9417724.085752]  [<ffffffff81195227>] ? alloc_fd+0xf7/0x150
[9417724.085755]  [<ffffffff81177538>] do_sys_open+0xf8/0x240
[9417724.085759]  [<ffffffff811776a0>] sys_open+0x20/0x30
[9417724.085762]  [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9417724.085765] INFO: task exim4:23744 blocked for more than 120 seconds.
[9417724.085766] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9417724.085769] exim4           D ffffffff81806200     0 23744   6911
0x00000000
[9417724.085772]  ffff88006d14fc78 0000000000000082 ffff8807e49f37b0
ffff88006d14fe38
[9417724.085777]  ffff88006d14ffd8 ffff88006d14ffd8 ffff88006d14ffd8
00000000000137c0
[9417724.085781]  ffff8807ea5e9700 ffff880010d01700 0000000000000081
ffff8806acde0f10
[9417724.085785] Call Trace:
[9417724.085789]  [<ffffffff816590ff>] schedule+0x3f/0x60
[9417724.085792]  [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9417724.085796]  [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9417724.085799]  [<ffffffff811862b4>] do_last+0x2b4/0x730
[9417724.085803]  [<ffffffff81187ad1>] path_openat+0xd1/0x3f0
[9417724.085806]  [<ffffffff81187f12>] do_filp_open+0x42/0xa0
[9417724.085811]  [<ffffffff81319811>] ? strncpy_from_user+0x31/0x40
[9417724.085814]  [<ffffffff8118325a>] ? do_getname+0x10a/0x180
[9417724.085817]  [<ffffffff8165b00e>] ? _raw_spin_lock+0xe/0x20
[9417724.085821]  [<ffffffff81195227>] ? alloc_fd+0xf7/0x150
[9417724.085824]  [<ffffffff81177538>] do_sys_open+0xf8/0x240
[9417724.085827]  [<ffffffff811776a0>] sys_open+0x20/0x30
[9417724.085831]  [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9418083.051777] INFO: task exim4:23744 blocked for more than 120 seconds.
[9418083.051782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9418083.051785] exim4           D 0000000000000005     0 23744   6911
0x00000000
[9418083.051790]  ffff88006d14fde8 0000000000000082 ffff880010d01700
0000000000000041
[9418083.051796]  ffff88006d14ffd8 ffff88006d14ffd8 ffff88006d14ffd8
00000000000137c0
[9418083.051801]  ffff8802799eae00 ffff880010d01700 0000000000000000
ffff8806acde0f10
[9418083.051805] Call Trace:
[9418083.051815]  [<ffffffff816590ff>] schedule+0x3f/0x60
[9418083.051821]  [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9418083.051826]  [<ffffffff81187361>] ? do_path_lookup+0x31/0xc0
[9418083.051839]  [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9418083.051841]  [<ffffffff8118774e>] do_unlinkat+0x8e/0x1d0
[9418083.051843]  [<ffffffff81178290>] ? vfs_write+0x110/0x180
[9418083.051845]  [<ffffffff8117855a>] ? sys_write+0x4a/0x90
[9418083.051846]  [<ffffffff811883e6>] sys_unlink+0x16/0x20
[9418083.051848]  [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9418083.051850] INFO: task exim4:23949 blocked for more than 120 seconds.
[9418083.051851] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9418083.051852] exim4           D 0000000000000004     0 23949   6911
0x00000000
[9418083.051853]  ffff88000d95bde8 0000000000000082 ffff880026654500
0000000000000041
[9418083.051855]  ffff88000d95bfd8 ffff88000d95bfd8 ffff88000d95bfd8
00000000000137c0
[9418083.051857]  ffff8807ea398000 ffff880026654500 0000000000000000
ffff8806acde0f10
[9418083.051859] Call Trace:
[9418083.051861]  [<ffffffff816590ff>] schedule+0x3f/0x60
[9418083.051862]  [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9418083.051864]  [<ffffffff81187361>] ? do_path_lookup+0x31/0xc0
[9418083.051865]  [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9418083.051866]  [<ffffffff8118774e>] do_unlinkat+0x8e/0x1d0
[9418083.051868]  [<ffffffff81178290>] ? vfs_write+0x110/0x180
[9418083.051869]  [<ffffffff8117855a>] ? sys_write+0x4a/0x90
[9418083.051871]  [<ffffffff811883e6>] sys_unlink+0x16/0x20
[9418083.051872]  [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9418083.051874] INFO: task exim4:24110 blocked for more than 120 seconds.
[9418083.051875] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9418083.051875] exim4           D 0000000000000006     0 24110   6911
0x00000000
[9418083.051877]  ffff8800222dbe38 0000000000000082 ffff8800222dbf68
ffffffff8118875d
[9418083.051879]  ffff8800222dbfd8 ffff8800222dbfd8 ffff8800222dbfd8
00000000000137c0
[9418083.051881]  ffff8807e5ce4500 ffff880218a21700 0000000000000003
ffff8806acde0f10
[9418083.051882] Call Trace:
[9418083.051884]  [<ffffffff8118875d>] ? sys_renameat+0xbd/0x240
[9418083.051886]  [<ffffffff816590ff>] schedule+0x3f/0x60
[9418083.051887]  [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9418083.051889]  [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9418083.051892]  [<ffffffff8121194f>] ext4_sync_file+0xef/0x2d0
[9418083.051893]  [<ffffffff811775bc>] ? do_sys_open+0x17c/0x240
[9418083.051896]  [<ffffffff811a6ae6>] do_fsync+0x56/0x80
[9418083.051898]  [<ffffffff811a6e10>] sys_fsync+0x10/0x20
[9418083.051899]  [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9418083.051901] INFO: task exim4:24185 blocked for more than 120 seconds.
[9418083.051902] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9418083.051903] exim4           D 0000000000000000     0 24185      1
0x00000000
[9418083.051905]  ffff8803242bfde8 0000000000000082 ffff8803242bfe38
ffffffff81646dda
[9418083.051907]  ffff8803242bffd8 ffff8803242bffd8 ffff8803242bffd8
00000000000137c0
[9418083.051908]  ffff8807ea3a9700 ffff88000dd89700 0000000000000000
ffff8806acde0f10
[9418083.051910] Call Trace:
[9418083.051913]  [<ffffffff81646dda>] ? __slab_free+0x1d2/0x2f3
[9418083.051915]  [<ffffffff816590ff>] schedule+0x3f/0x60
[9418083.051916]  [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9418083.051918]  [<ffffffff81187361>] ? do_path_lookup+0x31/0xc0
[9418083.051919]  [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9418083.051921]  [<ffffffff8118774e>] do_unlinkat+0x8e/0x1d0
[9418083.051924]  [<ffffffff81014721>] ? math_state_restore+0x51/0x80
[9418083.051925]  [<ffffffff811883e6>] sys_unlink+0x16/0x20
[9418083.051927]  [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9429960.233032] Process accounting paused
[9430317.176019] Process accounting resumed


I'n not sure i't exim to blame, but "exim" seems to be the only word I know
in this list:)

Exim version 4.76 #1 built 25-Oct-2012 14:23:36
Copyright (c) University of Cambridge, 1995 - 2007
Berkeley DB: Berkeley DB 5.1.25: (January 28, 2011)
Support for: crypteq iconv() IPv6 GnuTLS move_frozen_messages DKIM
Lookups (built-in): lsearch wildlsearch nwildlsearch iplsearch cdb dbm
dbmnz dnsdb dsearch nis nis0 passwd
Authenticators: cram_md5 plaintext
Routers: accept dnslookup ipliteral manualroute queryprogram redirect
Transports: appendfile/maildir/mailstore autoreply lmtp pipe smtp
Fixed never_users: 0
Size of off_t: 8

Actually i't used only for delivery of our local mail - less then 1
mail/sec. Can anyone say what happened? Thank you.
--
С уважением,
Антипов Илья