On 30/12/2020 08:04, Evgeniy Berdnikov via Exim-users wrote:
> On Wed, Dec 30, 2020 at 02:25:19PM +0700, Victor Sudakov via Exim-users wrote:
>> Is this ktrace informative https://termbin.com/zjsv ?
Yes; thanks.
>
> 8889 exim CALL socket(PF_INET,0x1<SOCK_STREAM>,IPPROTO_IP)
> 8889 exim RET socket 5
> 8889 exim CALL setitimer(0,0x7fffffff30d0,0x7fffffff30b0)
> 8889 exim STRU itimerval { .interval = {0, 0}, .value = {5, 0} }
> 8889 exim STRU itimerval { .interval = {0, 0}, .value = {0, 0} }
> 8889 exim RET setitimer 0
> 8889 exim CALL setsockopt(0x5,IPPROTO_TCP,TCP_FASTOPEN,0x254270,0x4)
> 8889 exim RET setsockopt 0
> 8889 exim CALL sendto(0x5,0x2322d6,0xa,0,0x7fffffff3150,0x10)
> 8889 exim STRU struct sockaddr { AF_INET, 192.168.153.104:3310 }
> 8889 exim GIO fd 5 wrote 10 bytes
> "zINSTREAM\0"
> 8889 exim RET sendto 10/0xa
> 8889 exim CALL setitimer(0,0x7fffffff30d0,0x7fffffff30b0)
> 8889 exim STRU itimerval { .interval = {0, 0}, .value = {0, 0} }
> 8889 exim STRU itimerval { .interval = {0, 0}, .value = {4, 999970} }
> 8889 exim RET setitimer 0
> 8889 exim CALL close(0xffffffff)
> 8889 exim RET close -1 errno 9 Bad file descriptor
>
> As packet is sent, it may be some problem with TCP_FASTOPEN, probably
> with its handling in hypervisor and/or external firewall.
Kernel, I think. The packet capture showed the SYNs not carrying
any TFO request, despite that TCP_FASTOPEN setsockopt. Probably the
FreeBSD implementation has changed since I worked on the Exim
implementation, in such a way as to break the combination.
In case it helps, my notes from then include:
# FreeBSD: it looks like you have to compile a custom kernel, with
# 'options TCP_RFC7413' in the config. Also set
# 'net.inet.tcp.fastopen.server_enable=1' in /etc/sysctl.conf
If there's a sysctl for enabling the client side, try changing
it. If that affects this, we need to know.

I could try to code up a backstop, retrying the connection without
TFO... sigh. Without some effort it wouldn't be particularly efficient
in service operation, and I call having to do that "ugly". Better
to get it working correctly, or to decide before trying that the
feature is not usable on the platform.

Slightly less ugly would be a config option "no TFO", either global
or just on the av_scanner address.
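
A rough sketch of what the backstop idea (retrying without TFO) might
look like, in Python for brevity rather than Exim's C. Everything here
is hypothetical: the class name, the two opener callables and the choice
of ENOTCONN as the only trigger are assumptions for illustration, not
Exim code. It shows why the approach is awkward: the early data has to
be kept so that it can be replayed over a plain connection.

```python
import errno


class ReplayOnNotConnected:
    """Hypothetical wrapper, not Exim code.

    Remembers everything written so far and, if a write reports ENOTCONN
    (as the second sendto() in the trace did), reopens the connection
    without TFO and replays the earlier data before continuing.
    """

    def __init__(self, open_tfo, open_plain):
        # Both arguments are assumed to be callables returning a
        # socket-like object with sendall() and close().
        self._open_plain = open_plain
        self._sock = open_tfo()
        self._sent = bytearray()   # kept only so it can be replayed
        self.fell_back = False

    def write(self, data: bytes) -> None:
        try:
            self._sock.sendall(data)
        except OSError as exc:
            # Retry once only, and only for "Socket is not connected".
            if exc.errno != errno.ENOTCONN or self.fell_back:
                raise
            self._sock.close()
            self._sock = self._open_plain()
            self.fell_back = True
            # Replay what the TFO attempt claimed to have sent, then
            # the data that triggered the error.
            self._sock.sendall(bytes(self._sent) + data)
        self._sent += data
```

In this sketch the replay buffer is never discarded, since nothing in
the trace tells the sender when the early data has really been
delivered. That is the inefficiency referred to above.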
Meantime, the Exim debug channels "acl" and "transport" would show
the sequence from a higher-level view. We might be able to guess
what that close(-1) was (yes, that's a bug, though not an important one).

A compile-time workaround would be to disable TFO support, by commenting
out the line "#define EXIM_SUPPORT_TFO" in src/ip.c.
>
> Consequent close(-1) is definitely an error, but let us ignore it now.
> Then exim reads file and tries to write into this socket:
>
> 8889 exim CALL sendto(0x5,0x7fffffff3344,0x4,0,0,0)
That write will be the filesize, given the zINSTREAM protocol
being used. The code assumes that the call will either block until
the TCP connection has been made (with the TFO data sent either
as part of that or immediately following), or queue the data for
when it is made. Either way, it's not expecting an error return.
> 8889 exim RET sendto -1 errno 57 Socket is not connected
The error would be reasonable if we'd not tried to connect - which
is what the first sendto() (under TCP_FASTOPEN) is supposed to do.
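
For reference, the byte stream implied by the trace can be sketched as
below. This assumes clamd's documented INSTREAM framing (a 4-byte
length in network byte order before each chunk, and a zero-length chunk
to finish) and assumes the whole file goes as a single chunk, so that
the length prefix equals the filesize. The function name is made up for
illustration; it is Python, not Exim's C.

```python
import struct


def frame_instream(payload: bytes) -> bytes:
    """Build the bytes for one zINSTREAM exchange (illustrative only)."""
    parts = [b"zINSTREAM\0"]            # the 10-byte command in the trace
    if payload:
        # 4-byte unsigned length, network byte order: the 4-byte write
        # that failed with "Socket is not connected" in the trace.
        parts.append(struct.pack("!I", len(payload)))
        parts.append(payload)
    parts.append(struct.pack("!I", 0))  # zero-length chunk ends the stream
    return b"".join(parts)
```

The first element is what the TFO sendto() carries. The length prefix is
the next write, which is the one that hit the error.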
--
Cheers,
Jeremy