On Fri, Aug 12, 2022 at 12:31:16PM -0400, Viktor Dukhovni via Exim-users wrote:
> If the problem persists with as much as possible of the hardware assist
> disabled, then it sure looks like Linux TCP is the culprit.
Unsurprisingly, this is indeed a Linux bug. Neal Cardwell from Google
shared the following:

    I strongly suspect this is a known issue with interactions between
    Exim and TFO causing machines to ignore packets, which was reported
    in this thread:

      https://lore.kernel.org/lkml/E1nZMdl-0006nG-0J@plastiekpoot/

    I tracked it down to a conntrack bug and suggested a fix, and the
    conntrack maintainers checked in an expanded fix here:

      c7aab4f17021b ("netfilter: nf_conntrack_tcp: re-init for syn packets only")
      https://lore.kernel.org/netdev/17c87824-7d04-c34e-bf6a-d8b874242636@tmb.nu/t/#mab1f2792ba24e98e3f41468c9781747a77c87ac9

    Can you please advise folks who run into this to upgrade to Linux v5.18
    or later (since it has the fix) or to cherry-pick in that fix?

    I see this patch was only backported to 5.17, and not to older stable
    releases. I will try to get it backported to other stable releases so
    more users pick up the fix automatically from their distributions...

It seems that with TFO the Linux TCP client is prone to losing track of
the window scale, and eventually the SMTP client runs out of TCP window,
matching Jeremy's observation that the client did not get far past the
initial window.
So either get a later (or patched) kernel, or disable TFO.
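
A rough way to check whether a given host falls into the "needs action"
case is sketched below in Python. It is only a heuristic under stated
assumptions: the function names are made up for illustration, the version
test looks only at the upstream major.minor number (so it will not notice
a 5.17 stable kernel or a distribution kernel that carries the backported
fix), and it does not check whether conntrack is in use at all. The only
system facts it relies on are that net.ipv4.tcp_fastopen is a bitmask and
that bit 0x1 enables client-side TFO.

```python
import platform
import re

CLIENT_TFO_BIT = 0x1  # bit 0 of net.ipv4.tcp_fastopen enables client-side TFO


def kernel_at_least_5_18(release):
    """Return True if a kernel release string is v5.18 or later.

    Heuristic only: a vendor kernel with an older version number may
    still carry the backported fix.
    """
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return (int(m.group(1)), int(m.group(2))) >= (5, 18)


def client_tfo_enabled(sysctl_value):
    """Return True if the client bit is set in the tcp_fastopen bitmask."""
    return bool(sysctl_value & CLIENT_TFO_BIT)


def advice(release, sysctl_value):
    """Map kernel release and TFO setting onto the two remedies above."""
    if not client_tfo_enabled(sysctl_value):
        return "client TFO is off"
    if kernel_at_least_5_18(release):
        return "kernel is v5.18 or later, fix should be present"
    return "possibly affected: upgrade or patch the kernel, or disable TFO"


if __name__ == "__main__":
    try:
        with open("/proc/sys/net/ipv4/tcp_fastopen") as f:
            value = int(f.read().strip())
    except (OSError, ValueError):
        print("could not read net.ipv4.tcp_fastopen on this system")
    else:
        print(advice(platform.release(), value))
```

To disable TFO system-wide, "sysctl -w net.ipv4.tcp_fastopen=0" clears
both the client and server bits. Exim's smtp transport also has a
hosts_try_fastopen option that should allow turning it off for outbound
SMTP only; check the spec for your Exim version before relying on that.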
--
Viktor.