Hello everybody.
This night server stopped responding for some time. About 3 hours. During
this period of time hdd became completely full (there was about 1Tb oа free
space) and 90 of 230 (aprox) processes became uninterruptiblle . And
suddenly space was freed and everything became stable again. Here what was
in dmesg.
[9417724.085399] INFO: task exim4:23580 blocked for more than 120 seconds.
[9417724.085403] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9417724.085406] exim4 D 0000000000000004 0 23580 6911
0x00000000
[9417724.085412] ffff88000d9a1de8 0000000000000086 ffff8807ea54c500
0000000000000041
[9417724.085418] ffff88000d9a1fd8 ffff88000d9a1fd8 ffff88000d9a1fd8
00000000000137c0
[9417724.085422] ffff8807e7c22e00 ffff8807ea54c500 0000000000000000
ffff8806acde0f10
[9417724.085427] Call Trace:
[9417724.085437] [<ffffffff816590ff>] schedule+0x3f/0x60
[9417724.085442] [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9417724.085447] [<ffffffff81187361>] ? do_path_lookup+0x31/0xc0
[9417724.085450] [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9417724.085454] [<ffffffff8118774e>] do_unlinkat+0x8e/0x1d0
[9417724.085459] [<ffffffff81178290>] ? vfs_write+0x110/0x180
[9417724.085462] [<ffffffff8117855a>] ? sys_write+0x4a/0x90
[9417724.085466] [<ffffffff811883e6>] sys_unlink+0x16/0x20
[9417724.085470] [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9417724.085473] INFO: task exim4:23608 blocked for more than 120 seconds.
[9417724.085475] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9417724.085478] exim4 D ffffffff81806200 0 23608 6911
0x00000000
[9417724.085482] ffff88006d161c78 0000000000000086 ffff8807e49f37b0
ffff88006d161e38
[9417724.085486] ffff88006d161fd8 ffff88006d161fd8 ffff88006d161fd8
00000000000137c0
[9417724.085491] ffffffff81c0d020 ffff880220d6ae00 0000000000000081
ffff8806acde0f10
[9417724.085495] Call Trace:
[9417724.085499] [<ffffffff816590ff>] schedule+0x3f/0x60
[9417724.085502] [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9417724.085506] [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9417724.085509] [<ffffffff811862b4>] do_last+0x2b4/0x730
[9417724.085513] [<ffffffff81187ad1>] path_openat+0xd1/0x3f0
[9417724.085516] [<ffffffff81183415>] ? putname+0x35/0x50
[9417724.085520] [<ffffffff81187f12>] do_filp_open+0x42/0xa0
[9417724.085526] [<ffffffff81319811>] ? strncpy_from_user+0x31/0x40
[9417724.085529] [<ffffffff8118325a>] ? do_getname+0x10a/0x180
[9417724.085533] [<ffffffff8165b00e>] ? _raw_spin_lock+0xe/0x20
[9417724.085538] [<ffffffff81195227>] ? alloc_fd+0xf7/0x150
[9417724.085541] [<ffffffff81177538>] do_sys_open+0xf8/0x240
[9417724.085544] [<ffffffff811776a0>] sys_open+0x20/0x30
[9417724.085548] [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9417724.085551] INFO: task exim4:23710 blocked for more than 120 seconds.
[9417724.085553] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9417724.085555] exim4 D ffffffff81806200 0 23710 6911
0x00000000
[9417724.085559] ffff880005f87c78 0000000000000086 ffff8807e49f37b0
ffff880005f87e38
[9417724.085563] ffff880005f87fd8 ffff880005f87fd8 ffff880005f87fd8
00000000000137c0
[9417724.085567] ffff8807ea099700 ffff88000d419700 0000000000000081
ffff8806acde0f10
[9417724.085572] Call Trace:
[9417724.085576] [<ffffffff816590ff>] schedule+0x3f/0x60
[9417724.085579] [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9417724.085582] [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9417724.085586] [<ffffffff811862b4>] do_last+0x2b4/0x730
[9417724.085589] [<ffffffff81187ad1>] path_openat+0xd1/0x3f0
[9417724.085593] [<ffffffff81183415>] ? putname+0x35/0x50
[9417724.085596] [<ffffffff81187f12>] do_filp_open+0x42/0xa0
[9417724.085601] [<ffffffff81319811>] ? strncpy_from_user+0x31/0x40
[9417724.085604] [<ffffffff8118325a>] ? do_getname+0x10a/0x180
[9417724.085607] [<ffffffff8165b00e>] ? _raw_spin_lock+0xe/0x20
[9417724.085611] [<ffffffff81195227>] ? alloc_fd+0xf7/0x150
[9417724.085614] [<ffffffff81177538>] do_sys_open+0xf8/0x240
[9417724.085617] [<ffffffff811776a0>] sys_open+0x20/0x30
[9417724.085621] [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9417724.085624] INFO: task exim4:23741 blocked for more than 120 seconds.
[9417724.085626] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9417724.085628] exim4 D ffffffff81806200 0 23741 6911
0x00000000
[9417724.085631] ffff8800b3df1c78 0000000000000082 ffff8807e49f37b0
ffff8800b3df1e38
[9417724.085636] ffff8800b3df1fd8 ffff8800b3df1fd8 ffff8800b3df1fd8
00000000000137c0
[9417724.085640] ffff8807ea5e9700 ffff8806d37c4500 0000000000000081
ffff8806acde0f10
[9417724.085644] Call Trace:
[9417724.085648] [<ffffffff816590ff>] schedule+0x3f/0x60
[9417724.085651] [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9417724.085655] [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9417724.085658] [<ffffffff811862b4>] do_last+0x2b4/0x730
[9417724.085662] [<ffffffff81187ad1>] path_openat+0xd1/0x3f0
[9417724.085665] [<ffffffff81183415>] ? putname+0x35/0x50
[9417724.085669] [<ffffffff81187f12>] do_filp_open+0x42/0xa0
[9417724.085673] [<ffffffff81319811>] ? strncpy_from_user+0x31/0x40
[9417724.085676] [<ffffffff8118325a>] ? do_getname+0x10a/0x180
[9417724.085680] [<ffffffff8165b00e>] ? _raw_spin_lock+0xe/0x20
[9417724.085683] [<ffffffff81195227>] ? alloc_fd+0xf7/0x150
[9417724.085686] [<ffffffff81177538>] do_sys_open+0xf8/0x240
[9417724.085690] [<ffffffff811776a0>] sys_open+0x20/0x30
[9417724.085693] [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9417724.085696] INFO: task exim4:23742 blocked for more than 120 seconds.
[9417724.085698] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9417724.085700] exim4 D ffffffff81806200 0 23742 6911
0x00000000
[9417724.085704] ffff880010d0bc78 0000000000000086 ffff8807e49f37b0
ffff880010d0be38
[9417724.085708] ffff880010d0bfd8 ffff880010d0bfd8 ffff880010d0bfd8
00000000000137c0
[9417724.085712] ffff8807ea001700 ffff88000d41dc00 0000000000000081
ffff8806acde0f10
[9417724.085716] Call Trace:
[9417724.085720] [<ffffffff816590ff>] schedule+0x3f/0x60
[9417724.085724] [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9417724.085727] [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9417724.085730] [<ffffffff811862b4>] do_last+0x2b4/0x730
[9417724.085734] [<ffffffff81187ad1>] path_openat+0xd1/0x3f0
[9417724.085738] [<ffffffff81187f12>] do_filp_open+0x42/0xa0
[9417724.085742] [<ffffffff81319811>] ? strncpy_from_user+0x31/0x40
[9417724.085745] [<ffffffff8118325a>] ? do_getname+0x10a/0x180
[9417724.085749] [<ffffffff8165b00e>] ? _raw_spin_lock+0xe/0x20
[9417724.085752] [<ffffffff81195227>] ? alloc_fd+0xf7/0x150
[9417724.085755] [<ffffffff81177538>] do_sys_open+0xf8/0x240
[9417724.085759] [<ffffffff811776a0>] sys_open+0x20/0x30
[9417724.085762] [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9417724.085765] INFO: task exim4:23744 blocked for more than 120 seconds.
[9417724.085766] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9417724.085769] exim4 D ffffffff81806200 0 23744 6911
0x00000000
[9417724.085772] ffff88006d14fc78 0000000000000082 ffff8807e49f37b0
ffff88006d14fe38
[9417724.085777] ffff88006d14ffd8 ffff88006d14ffd8 ffff88006d14ffd8
00000000000137c0
[9417724.085781] ffff8807ea5e9700 ffff880010d01700 0000000000000081
ffff8806acde0f10
[9417724.085785] Call Trace:
[9417724.085789] [<ffffffff816590ff>] schedule+0x3f/0x60
[9417724.085792] [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9417724.085796] [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9417724.085799] [<ffffffff811862b4>] do_last+0x2b4/0x730
[9417724.085803] [<ffffffff81187ad1>] path_openat+0xd1/0x3f0
[9417724.085806] [<ffffffff81187f12>] do_filp_open+0x42/0xa0
[9417724.085811] [<ffffffff81319811>] ? strncpy_from_user+0x31/0x40
[9417724.085814] [<ffffffff8118325a>] ? do_getname+0x10a/0x180
[9417724.085817] [<ffffffff8165b00e>] ? _raw_spin_lock+0xe/0x20
[9417724.085821] [<ffffffff81195227>] ? alloc_fd+0xf7/0x150
[9417724.085824] [<ffffffff81177538>] do_sys_open+0xf8/0x240
[9417724.085827] [<ffffffff811776a0>] sys_open+0x20/0x30
[9417724.085831] [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9418083.051777] INFO: task exim4:23744 blocked for more than 120 seconds.
[9418083.051782] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9418083.051785] exim4 D 0000000000000005 0 23744 6911
0x00000000
[9418083.051790] ffff88006d14fde8 0000000000000082 ffff880010d01700
0000000000000041
[9418083.051796] ffff88006d14ffd8 ffff88006d14ffd8 ffff88006d14ffd8
00000000000137c0
[9418083.051801] ffff8802799eae00 ffff880010d01700 0000000000000000
ffff8806acde0f10
[9418083.051805] Call Trace:
[9418083.051815] [<ffffffff816590ff>] schedule+0x3f/0x60
[9418083.051821] [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9418083.051826] [<ffffffff81187361>] ? do_path_lookup+0x31/0xc0
[9418083.051839] [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9418083.051841] [<ffffffff8118774e>] do_unlinkat+0x8e/0x1d0
[9418083.051843] [<ffffffff81178290>] ? vfs_write+0x110/0x180
[9418083.051845] [<ffffffff8117855a>] ? sys_write+0x4a/0x90
[9418083.051846] [<ffffffff811883e6>] sys_unlink+0x16/0x20
[9418083.051848] [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9418083.051850] INFO: task exim4:23949 blocked for more than 120 seconds.
[9418083.051851] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9418083.051852] exim4 D 0000000000000004 0 23949 6911
0x00000000
[9418083.051853] ffff88000d95bde8 0000000000000082 ffff880026654500
0000000000000041
[9418083.051855] ffff88000d95bfd8 ffff88000d95bfd8 ffff88000d95bfd8
00000000000137c0
[9418083.051857] ffff8807ea398000 ffff880026654500 0000000000000000
ffff8806acde0f10
[9418083.051859] Call Trace:
[9418083.051861] [<ffffffff816590ff>] schedule+0x3f/0x60
[9418083.051862] [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9418083.051864] [<ffffffff81187361>] ? do_path_lookup+0x31/0xc0
[9418083.051865] [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9418083.051866] [<ffffffff8118774e>] do_unlinkat+0x8e/0x1d0
[9418083.051868] [<ffffffff81178290>] ? vfs_write+0x110/0x180
[9418083.051869] [<ffffffff8117855a>] ? sys_write+0x4a/0x90
[9418083.051871] [<ffffffff811883e6>] sys_unlink+0x16/0x20
[9418083.051872] [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9418083.051874] INFO: task exim4:24110 blocked for more than 120 seconds.
[9418083.051875] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9418083.051875] exim4 D 0000000000000006 0 24110 6911
0x00000000
[9418083.051877] ffff8800222dbe38 0000000000000082 ffff8800222dbf68
ffffffff8118875d
[9418083.051879] ffff8800222dbfd8 ffff8800222dbfd8 ffff8800222dbfd8
00000000000137c0
[9418083.051881] ffff8807e5ce4500 ffff880218a21700 0000000000000003
ffff8806acde0f10
[9418083.051882] Call Trace:
[9418083.051884] [<ffffffff8118875d>] ? sys_renameat+0xbd/0x240
[9418083.051886] [<ffffffff816590ff>] schedule+0x3f/0x60
[9418083.051887] [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9418083.051889] [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9418083.051892] [<ffffffff8121194f>] ext4_sync_file+0xef/0x2d0
[9418083.051893] [<ffffffff811775bc>] ? do_sys_open+0x17c/0x240
[9418083.051896] [<ffffffff811a6ae6>] do_fsync+0x56/0x80
[9418083.051898] [<ffffffff811a6e10>] sys_fsync+0x10/0x20
[9418083.051899] [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9418083.051901] INFO: task exim4:24185 blocked for more than 120 seconds.
[9418083.051902] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[9418083.051903] exim4 D 0000000000000000 0 24185 1
0x00000000
[9418083.051905] ffff8803242bfde8 0000000000000082 ffff8803242bfe38
ffffffff81646dda
[9418083.051907] ffff8803242bffd8 ffff8803242bffd8 ffff8803242bffd8
00000000000137c0
[9418083.051908] ffff8807ea3a9700 ffff88000dd89700 0000000000000000
ffff8806acde0f10
[9418083.051910] Call Trace:
[9418083.051913] [<ffffffff81646dda>] ? __slab_free+0x1d2/0x2f3
[9418083.051915] [<ffffffff816590ff>] schedule+0x3f/0x60
[9418083.051916] [<ffffffff81659f07>] __mutex_lock_slowpath+0xd7/0x150
[9418083.051918] [<ffffffff81187361>] ? do_path_lookup+0x31/0xc0
[9418083.051919] [<ffffffff81659b1a>] mutex_lock+0x2a/0x50
[9418083.051921] [<ffffffff8118774e>] do_unlinkat+0x8e/0x1d0
[9418083.051924] [<ffffffff81014721>] ? math_state_restore+0x51/0x80
[9418083.051925] [<ffffffff811883e6>] sys_unlink+0x16/0x20
[9418083.051927] [<ffffffff81663602>] system_call_fastpath+0x16/0x1b
[9429960.233032] Process accounting paused
[9430317.176019] Process accounting resumed
I'n not sure i't exim to blame, but "exim" seems to be the only word I know
in this list:)
Exim version 4.76 #1 built 25-Oct-2012 14:23:36
Copyright (c) University of Cambridge, 1995 - 2007
Berkeley DB: Berkeley DB 5.1.25: (January 28, 2011)
Support for: crypteq iconv() IPv6 GnuTLS move_frozen_messages DKIM
Lookups (built-in): lsearch wildlsearch nwildlsearch iplsearch cdb dbm
dbmnz dnsdb dsearch nis nis0 passwd
Authenticators: cram_md5 plaintext
Routers: accept dnslookup ipliteral manualroute queryprogram redirect
Transports: appendfile/maildir/mailstore autoreply lmtp pipe smtp
Fixed never_users: 0
Size of off_t: 8
Actually i't used only for delivery of our local mail - less then 1
mail/sec. Can anyone say what happened? Thank you.
--
С уважением,
Антипов Илья