[exim-dev] Cutthrough routing: data loss on client side erro…

Top Page
Delete this message
Reply to this message
Author: Heiko Schlittermann
Date:  
To: exim-dev
Subject: [exim-dev] Cutthrough routing: data loss on client side error
Hello,

we've some finding about cutthrough routing…
It seems we loose parts of the messages.


Our setup:

{ internet } --> host HH --> cutthrough --> host MX
exim 4.86_8-c4dcf90 (4.86+fixes) on HH

Sometimes the delivery size on host HH from the log
is *bigger* then the delivery size on host MX. Customers
reported cut attachments.

msg-id ----------------------------------------+ size HH + size MX
...@eu-west-1.amazonses.com                      69739     9088
...@captrain.de                                 889136    15803
...@www.travelbeyond.de                          28680     3648
...@webprd-m107.mail.aol.com                     44967     9360
...@SBS2011.zeller-france.local                 918753   903892
...@amsona-coaching.de                          283006     2392
...@Paraguay.intranet.bonapartehotel.it          58941     5933
...@cmail20.com                                  98329     4446
...@amsona-coaching.de                          283694     2408
...@hajmxmdb03.msrf.insite                      281470     3474
...@news.lidl.de                                142437     6759
...@VI1PR02MB1392.eurprd02.prod.outlook.com       4499     4277


It seems(!) that there is some correlation to SSL problems of:

On host HH

    2016-01-18 20:06:49 [10762] 1aLF8a-0002na-5D DKIM: d=amazon.de s=fraqczefv5ucvzwxuiuhta623hvupqlf c=relaxed/simple a=rsa-sha256 t=1453143470 [verification succeeded]
    2016-01-18 20:06:49 [10762] 1aLF8a-0002na-5D DKIM: d=amazonses.com s=uku4taia5b5tsbglxyj6zym32efj7xqv c=relaxed/simple a=rsa-sha256 t=1453143470 [verification succeeded]
    2016-01-18 20:06:49 [10762] 1aLF8a-0002na-5D Header modification in data ACLs will not take effect on cutthrough deliveries
    2016-01-18 20:06:51 [10762] 1aLF8a-0002na-5D Header modification in data ACLs will not take effect on cutthrough deliveries
E   2016-01-18 20:06:51 [10762] 1aLF8a-0002na-5D SSL_write: (from mailout01.mymxserver.com [185.15.192.32]) syscall: Connection reset by peer
E   2016-01-18 20:06:51 [10762] 1aLF8a-0002na-5D SSL_write error 5
    2016-01-18 20:06:51 [10762] 1aLF8a-0002na-5D >> {zensored} R=dnslookup T=remote_smtp H=ssl.schlittermann.de [212.80.235.130] X=TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256 CV=no DN="/C=DE/ST=Sachsen/L=Dresden/O=Heiko Schlittermann/CN=ssl.schlittermann.de/emailAddress=postmaster@???" C="250 OK id=1aLF8W-0002wH-6N"
    2016-01-18 20:06:51 [10762] 1aLF8a-0002na-5D TLS error (SSL_write): error:1409F07F:SSL routines:SSL3_WRITE_PENDING:bad write retry
S   2016-01-18 20:06:51 [10762] 1aLF8a-0002na-5D <= {zensored} H=mailout01.mymxserver.com [185.15.192.32] P=esmtps X=TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256 CV=no S=69739 id={zensored} T="Ihre Amazon.de Bestellung von \"Was wir sind und was wir...\" wurde\n versandt!"
    2016-01-18 20:06:51 [10762] 1aLF8a-0002na-5D Completed


On host MX (yes, poorly syncronized clocks)

    2016-01-18 20:06:44 [11301] 1aLF8W-0002wH-6N DKIM: d=amazon.de s=fraqczefv5ucvzwxuiuhta623hvupqlf c=relaxed/simple a=rsa-sha256 t=1453143470 [verification failed - body hash mismatch (body probably modified in transit)]
    2016-01-18 20:06:44 [11301] 1aLF8W-0002wH-6N DKIM: d=amazonses.com s=uku4taia5b5tsbglxyj6zym32efj7xqv c=relaxed/simple a=rsa-sha256 t=1453143470 [verification failed - body hash mismatch (body probably modified in transit)]
    2016-01-18 20:06:44 [11301] 1aLF8W-0002wH-6N H=hh.schlittermann.de [213.128.132.49] I=[84.19.194.2]:25 Warning: message size (-1 / 7986)
S  2016-01-18 20:06:46 [11301] 1aLF8W-0002wH-6N <= {zenzored} H=hh.schlittermann.de [213.128.132.49] I=[84.19.194.2]:25 P=esmtps X=TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256 CV=no S=9088 id=000001525619b204-43d5a4cb-3142-4ce8-a9b6-7517370027c7-000000@??? T="Ihre Amazon.de Bestellung von \"Was wir sind und was wir...\" wurde\n versandt!"
    2016-01-18 20:06:46 [11313] 1aLF8W-0002wH-6N [84.19.194.3] SSL verify error: certificate name mismatch: "/C=DE/ST=Sachsen/L=Dresden/O=Heiko Schlittermann/CN=ssl.schlittermann.de/emailAddress=postmaster@???"
    2016-01-18 20:06:46 [11311] 1aLF8W-0002wH-6N => {zenzored} I=[84.19.194.2] F=<{zenzored} R=smart T=smtp H=pop.net.schlittermann.de [84.19.194.3] X=TLSv1.2:ECDHE-RSA-AES256-GCM-SHA384:256 CV=no C="250 OK id=1aLF8Y-0001tz-8i" QT=2s DT=0s
    2016-01-18 20:06:46 [11311] 1aLF8W-0002wH-6N Completed QT=2s


Of the S=69739 bytes (delivery size) the other end of the cutthrough
routing reports 9088 bytes delivery size and a completly fine smtp
transaction (no anomalies in the log).

What's going on here?

I *assume*, the SSL_write error happens after sending the '250 OK' to
the (client) downstream, during connection shutdown. The client just
cuts the connection w/o proper SSL connection shutdown. Parts of Exim
treat this as a problem and stop the cutthrough delivery. But it
doesn't just stop it, it seems to finish it properly, sending the final
dot, because otherwise the upstream server (Exim too) should log an
unexpected connection loss. And afterwards the
forwarding/cuttrough-routing server counts the transaction as completed
too, removing all local files.

    Best regards from Dresden/Germany
    Viele Grüße aus Dresden
    Heiko Schlittermann
-- 
 SCHLITTERMANN.de ---------------------------- internet & unix support -
 Heiko Schlittermann, Dipl.-Ing. (TU) - {fon,fax}: +49.351.802998{1,3} -
 gnupg encrypted messages are welcome --------------- key ID: F69376CE -
 ! key id 7CBF764A and 972EAC9F are revoked since 2015-01 ------------ -