Re: [exim] TLS error in incoming emails from *.outlook.com

Author: Viktor Dukhovni
Date:  
To: exim-users
Subject: Re: [exim] TLS error in incoming emails from *.outlook.com


> On Feb 12, 2018, at 8:21 PM, Andreas Bauer via Exim-users <exim-users@???> wrote:
>
>     504 540.259940     40.92.67.82           <EXIM4_IP>          TCP      66     45792 → 25 [SYN, ECN, CWR] Seq=0 Win=8192 Len=0 MSS=1460 WS=256 SACK_PERM=1
>     505 540.259967     <EXIM4_IP>          40.92.67.82           TCP      66     25 → 45792 [SYN, ACK] Seq=0 Ack=1 Win=29200 Len=0 MSS=1460 SACK_PERM=1 WS=128


The client offers ECN (your SYN-ACK does not echo it), and both sides negotiate selective ACKs and window scaling, with an MSS of 1460 in both directions.

>     518 540.839177     40.92.67.82           <EXIM4_IP>          TCP      2974   45792 → 25 [ACK] Seq=139 Ack=307 Win=65280 Len=2920
>     519 540.839183     40.92.67.82           <EXIM4_IP>          TCP      2974   45792 → 25 [ACK] Seq=3059 Ack=307 Win=65280 Len=2920


And yet you see network packets with an alleged length of 2920 (twice MSS) bytes, which
implies that your network card is doing TCP offload, and consolidating TCP segments.
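The arithmetic behind that inference can be sketched in a few lines of Python. The MSS and Len values are copied from the quoted trace; the helper name wire_segments is made up for illustration.

```python
import math

MSS = 1460  # bytes, as advertised in both the SYN and the SYN-ACK above

def wire_segments(captured_len, mss=MSS):
    """Minimum number of on-the-wire TCP segments needed to carry captured_len bytes."""
    return math.ceil(captured_len / mss)

# Len values that appear in the capture.
for length in (1460, 2920, 4380, 7300):
    print(length, "bytes ->", wire_segments(length), "segment(s)")
```

Any captured "packet" that needs more than one segment cannot have crossed the wire in that form, so something on the receiving host merged it before the capture point.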

>     520 540.839198     <EXIM4_IP>          40.92.67.82           TCP      54     25 → 45792 [ACK] Seq=307 Ack=3059 Win=35072 Len=0
>     521 540.839205     <EXIM4_IP>          40.92.67.82           TCP      54     25 → 45792 [ACK] Seq=307 Ack=5979 Win=40960 Len=0


You ACK both segments,

>     530 541.132235     40.92.67.82           <EXIM4_IP>          TCP      1514   [TCP Spurious Retransmission] 45792 → 25 [ACK] Seq=139 Ack=307 Win=65280 Len=1460


But about 0.3 seconds later (a longish pause) the remote client retransmits just the initial
segment, so it received neither ACK. TCP retransmission is a low-level kernel feature
and has little to do with "server misconfiguration". Something at layer 4 or below
is dropping your ACKs for the consolidated segments.
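A rough way to spot this pattern mechanically, as a sketch only: flag any data segment whose bytes were all covered by an ACK the receiver had already sent. The tuples below are copied from frames 518-521 and 530 of the trace; the function name and tuple layout are assumptions for illustration, and sequence-number wraparound is ignored.

```python
# (frame, sender, seq, length, ack), copied from the quoted trace.
FRAMES = [
    (518, "client", 139, 2920, 307),
    (519, "client", 3059, 2920, 307),
    (520, "server", 307, 0, 3059),
    (521, "server", 307, 0, 5979),
    (530, "client", 139, 1460, 307),
]

def already_acked_retransmissions(frames, receiver="server"):
    """Return frame numbers of data segments the receiver had already ACKed in full."""
    highest_ack = 0
    flagged = []
    for frame, sender, seq, length, ack in frames:
        if sender == receiver:
            # Track the highest cumulative ACK the receiver has sent so far.
            highest_ack = max(highest_ack, ack)
        elif length > 0 and seq + length <= highest_ack:
            # The sender is repeating bytes the receiver already acknowledged,
            # so the earlier ACK presumably never reached it.
            flagged.append(frame)
    return flagged

print(already_acked_retransmissions(FRAMES))
```

With the frames above this flags frame 530, the one Wireshark labels "Spurious Retransmission".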

>     531 541.132256     <EXIM4_IP>          40.92.67.82           TCP      66     [TCP Dup ACK 521#1] 25 → 45792 [ACK] Seq=307 Ack=5979 Win=40960 Len=0 SLE=139 SRE=1599


However your duplicate ACK appears to get through as transmission resumes:

>     532 541.141807     40.92.67.82           <EXIM4_IP>          TCP      4434   45792 → 25 [ACK] Seq=5979 Ack=307 Win=65280 Len=4380


This time with three consolidated segments.

>     533 541.141814     40.92.67.82           <EXIM4_IP>          TCP      2974   45792 → 25 [PSH, ACK] Seq=10359 Ack=307 Win=65280 Len=2920


And then another two.

>     534 541.141828     <EXIM4_IP>          40.92.67.82           TCP      54     25 → 45792 [ACK] Seq=307 Ack=10359 Win=49664 Len=0
>     535 541.141845     <EXIM4_IP>          40.92.67.82           TCP      54     25 → 45792 [ACK] Seq=307 Ack=13279 Win=55552 Len=0


Again these ACKs are lost...

>     536 542.054064     40.92.67.82           <EXIM4_IP>          TCP      1514   [TCP Spurious Retransmission] 45792 → 25 [ACK] Seq=5979 Ack=307 Win=65280 Len=1460


And the client tries again, this time with only a single probe segment.

>     537 542.054137     <EXIM4_IP>          40.92.67.82           TCP      66     [TCP Dup ACK 535#1] 25 → 45792 [ACK] Seq=307 Ack=13279 Win=55552 Len=0 SLE=5979 SRE=7439


Which elicits an ACK for all the outstanding data, and this ACK gets through:

>     538 542.063382     40.92.67.82           <EXIM4_IP>          TCP      7354   45792 → 25 [ACK] Seq=13279 Ack=307 Win=65280 Len=7300


And now you see 5 consolidated segments...

>     539 542.063419     <EXIM4_IP>          40.92.67.82           TCP      54     25 → 45792 [ACK] Seq=307 Ack=20579 Win=70144 Len=0


But your ACK is lost.

>     540 544.772896     40.92.67.82           <EXIM4_IP>          TCP      1514   [TCP Spurious Retransmission] 45792 → 25 [ACK] Seq=13279 Ack=307 Win=65280 Len=1460


Once again a single-segment retransmission probe.

>     541 544.772932     <EXIM4_IP>          40.92.67.82           TCP      66     [TCP Dup ACK 539#1] 25 → 45792 [ACK] Seq=307 Ack=20579 Win=70144 Len=0 SLE=13279 SRE=14739


And the dup ACK makes it:

>     542 544.782341     40.92.67.82           <EXIM4_IP>          TCP      1514   45792 → 25 [ACK] Seq=20579 Ack=307 Win=65280 Len=1460


So the client again resumes transmission, delivering a normal segment.

>     543 544.782360     <EXIM4_IP>          40.92.67.82           TCP      54     25 → 45792 [ACK] Seq=307 Ack=22039 Win=73088 Len=0


Whose ACK gets through.

>     544 544.782442     40.92.67.82           <EXIM4_IP>          TCP      1514   45792 → 25 [ACK] Seq=22039 Ack=307 Win=65280 Len=1460
>     545 544.782447     <EXIM4_IP>          40.92.67.82           TCP      54     25 → 45792 [ACK] Seq=307 Ack=23499 Win=76032 Len=0


Ditto.

>     546 544.782493     40.92.67.82           <EXIM4_IP>          TCP      4434   45792 → 25 [PSH, ACK] Seq=23499 Ack=307 Win=65280 Len=4380


But now a 3-MSS segment (from the TCP-offload NIC).

>     547 544.782495     <EXIM4_IP>          40.92.67.82           TCP      54     25 → 45792 [ACK] Seq=307 Ack=27879 Win=84736 Len=0


And again your ACK is lost... Etc.
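To put numbers on the stalls, here is a small sketch that measures the gap between your last ACK and the client's retransmission for the three lost-ACK episodes above. The timestamps are copied from the trace; the helper name stall_durations is made up for illustration.

```python
# (ack_frame, ack_time, retransmit_frame, retransmit_time), copied from the trace.
STALLS = [
    (521, 540.839205, 530, 541.132235),
    (535, 541.141845, 536, 542.054064),
    (539, 542.063419, 540, 544.772896),
]

def stall_durations(stalls):
    """Seconds between the last ACK sent and the client's retransmission."""
    return [round(retx_time - ack_time, 3) for _, ack_time, _, retx_time in stalls]

for (ack_frame, _, retx_frame, _), gap in zip(STALLS, stall_durations(STALLS)):
    print(f"ACK in frame {ack_frame} -> retransmit in frame {retx_frame}: {gap} s")
```

The gaps come out at roughly 0.3 s, 0.9 s and 2.7 s, growing each time, which would be consistent with the sender's retransmission timer backing off after every loss.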

> I have no clue what is happening there. One can see a correct SMTP dialog, and then a message follows with a base64 attachment. Somewhere in that transmission, it just stops. Also interesting, the timeline.


You have buggy TCP-offload, and/or a buggy firewall, possibly aggravated by TCP window scaling...
Most likely Microsoft is just fine. So disable TCP-offload in your network card and possibly
window scaling in your kernel if disabling offload is not enough.
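As a sketch of what that could look like, assuming a Linux host with ethtool installed: the script below only prints the commands so they can be reviewed before being run as root. The interface name eth0 is a placeholder, and which offload features actually matter on your NIC is a guess; the function name is made up for illustration.

```python
import shlex

def offload_disable_commands(iface):
    """Build (but do not run) commands to turn off segment offloads and window scaling."""
    return [
        # Turn off send-side (TSO/GSO) and receive-side (GRO/LRO) segment coalescing.
        ["ethtool", "-K", iface, "tso", "off", "gso", "off", "gro", "off", "lro", "off"],
        # Only if disabling offload is not enough: turn off TCP window scaling.
        ["sysctl", "-w", "net.ipv4.tcp_window_scaling=0"],
    ]

# "eth0" is an assumed interface name; substitute the real one.
for cmd in offload_disable_commands("eth0"):
    print(shlex.join(cmd))
```

Neither setting persists across a reboot when applied this way, which is convenient for testing: try the ethtool change first, repeat the capture, and only then consider the sysctl.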

-- 
    Viktor.