Re: [exim] stuck exim processes

Top Page
Delete this message
Reply to this message
Author: Martin Waschbüsch
Date:  
To: exim-users
CC: Jeremy Harris
Subject: Re: [exim] stuck exim processes
------ Original Message ------
From: "Jeremy Harris via Exim-users" <exim-users@???>
To: exim-users@???
Sent: 04.02.2022 18:35:14
Subject: Re: [exim] stuck exim processes

>On 04/02/2022 09:53, Martin Waschbüsch via Exim-users wrote:
>>this process is stuck
>
>>Any and all hints are appreciated!
>
>Strace the process. Is it doing anything? Waiting on network i/o?
>Spinning?
>

Got another message stuck this way:

root@relay01:~# ps ax | grep exim
   807  -  Ss       0:07.01 /usr/local/sbin/exim -bd -q30m
35680  -  S        0:00.01 /usr/local/sbin/exim -Mc 1nGzzC-0009HT-3F
35685  -  I        0:00.03 /usr/local/sbin/exim -Mc 1nGzzC-0009HT-3F
36493  -  I        0:00.03 /usr/local/sbin/exim -bd -q30m


I only have truss, not strace:

root@relay01:~# truss -p 35680
wait4(-1,{ STOPPED,sig=127 },WNOHANG,0x0)    = 0 (0x0)


root@relay01:~# truss -p 35685

after stopping exim and killing the processes, I restarted exim.
This time (did not happen the last time), the message got frozen:

2022-02-07 09:58:25.414 [35680] 1nGzzC-0009HT-3F Delivery status for
someone@???: got 0 of 7 bytes (pipeheader) from transport process
35685 for transport smtp
2022-02-07 09:58:25.417 [35680] 1nGzzC-0009HT-3F == someone@???
R=dnslookup T=remote_smtp defer (-1) DT=0.000s: smtp transport process
returned non-zero status 0x0009: terminated by signal 9
2022-02-07 09:58:25.417 [35680] 1nGzzC-0009HT-3F Frozen
2022-02-07 09:58:41.525 [36659] 1nGzzC-0009HT-3F Message is frozen

forcing a queue run (and thawing) by exim -qff -v resulted in:

root@relay01:~# exim -qff -v
LOG: MAIN
cwd=/root 3 args: exim -qff -v
LOG: queue_run MAIN
Start queue run: pid=36669 -qff
delivering 1nGzzC-0009HT-3F (queue run pid 36669)
LOG: MAIN
Unfrozen by forced delivery
LOG: MAIN
Completed QT=52m15s
LOG: queue_run MAIN
End queue run: pid=36669 -qff

and in the log:

2022-02-07 09:59:17.864 [36671] 1nGzzC-0009HT-3F Unfrozen by forced
delivery
2022-02-07 09:59:17.866 [36671] 1nGzzC-0009HT-3F Completed QT=52m15s

No delivery attempt in the log.

>
>What does "mailq" say (after you kill the process) - is the problem
>message still on the queue? Is that recipient still undelivered?
>
>If so, run a manual delivery with debug:
> exim -d+all -M <message-id> 2>&1 | tee my_debug_logfile


I will try the manual delivery with debug the next time around, but
maybe this is helpful already in narrowing in on what is happening.
Also: Going through the logs, it does not always occurr in combination
with a TLS error, so for now I think that part is just a coincidence.

>
>How far does it get?
>-- Cheers,
> Jeremy
>
>-- ## List details at https://lists.exim.org/mailman/listinfo/exim-users
>## Exim details at http://www.exim.org/
>## Please use the Wiki with this list - http://wiki.exim.org/