Re: [exim] 2 hours delay (gnutls_handshake): timed out: deli…

Author: Jeremy Harris
Date:  
To: exim-users
Subject: Re: [exim] 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to
On 07/04/2022 15:16, tt-admin via Exim-users wrote:
> Here is the complete strace of the hanging process:
>
> https://pastebin.com/wPPGab1K


31032 10:47:07 wait4(-1, 0x7fff70a35a0c, WNOHANG, NULL) = 0
31032 10:47:07 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0}) = 0 (Timeout)

This looks like 31032 is your daemon, running with a queue_interval of 60 seconds
(and with select rather than poll, you are running with likely known
bugs active on FreeBSD).


31037 10:47:07 <... recvfrom resumed> 0x55f93be5671b, 324, 0, NULL, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
31032 10:47:07 --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=32665, si_uid=0} ---
31037 10:47:07 --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=32665, si_uid=0} ---

Presumably 31037 is the process of interest. SIGUSR1 was possibly the result of
running exiwhat.

31037 10:46:58 write(9, "31037 delivering 1nZqMd-00084U-Qc to foo.bar [x.x.x.x] (foog@bar)\n", 105 <unfinished ...>
31037 10:46:58 rt_sigreturn({mask=[]} <unfinished ...>

consistent with exiwhat

31037 10:46:58 recvfrom(7, <unfinished ...>

and back to waiting in read-from-network...
We can at least discount a lack-of-entropy issue.



The call from the Exim transport to the GnuTLS library gnutls_handshake()
routine is wrapped in an alarm() call, set by the transport option "command_timeout".
The default for that is 5 minutes (but check your config; a setting of
two hours would probably be unwise). You also mentioned that you've seen
5 minutes on other connections, presumably on the same transport.
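
For anyone unfamiliar with the pattern, here is a minimal sketch of
"a blocking call wrapped in an alarm()". It is Python rather than Exim's
C, and is not Exim's actual code: the names call_with_alarm, HandshakeTimeout
and demo are made up for illustration, and a read on an idle pipe stands in
for a handshake with a peer that never answers.

```python
import os
import signal


class HandshakeTimeout(Exception):
    """Raised when the alarm fires before the blocking call returns."""


def _on_alarm(signum, frame):
    # Raising here aborts the interrupted system call instead of restarting it.
    raise HandshakeTimeout("timed out")


def call_with_alarm(blocking_call, timeout_s):
    """Run blocking_call() under an alarm() of timeout_s whole seconds."""
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout_s)  # arm: SIGALRM is delivered after timeout_s seconds
    try:
        return blocking_call()
    finally:
        signal.alarm(0)  # disarm, whether we finished or timed out
        signal.signal(signal.SIGALRM, old_handler)


def demo(timeout_s=1):
    # Stand-in for a silent peer: read from a pipe that nobody writes to.
    r, w = os.pipe()
    try:
        call_with_alarm(lambda: os.read(r, 1), timeout_s)
        return "completed"
    except HandshakeTimeout:
        return "timed out"
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(demo())
```

With this shape, the blocking read cannot outlive the timer: if the
alarm is armed with 300 seconds, the wait ends after 300 seconds. That is
why the value actually passed when the alarm is set is the thing to check.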

Seeing (via strace) the syscall *setting* that alarm might be interesting
(though I fear we'll see it being 300s and be no closer to a fix).
--
Cheers,
Jeremy