[Exim] Timeouts during outbound SMTP data transfer

Author: Ian Jackson
Date:  
To: exim-users
Subject: [Exim] Timeouts during outbound SMTP data transfer
My system has recently acquired a mailing list previously hosted
elsewhere, and this list has some addresses which are causing problems
for Exim.

Symptoms with -d9 are (identifying information removed):

Connecting to MX.EXAMPLE.COM [127.0.0.3] ... connected
SMTP<< 220 MX.EXAMPLE.COM ESMTP Sendmail 8.9.1/8.9.1; Wed, 18 Oct 2000 14:35:08 -0400
SMTP>> EHLO chiark.greenend.org.uk

  SMTP<< 250-MX.EXAMPLE.COM Hello mail@??? [195.224.76.132], pleased to meet you
         250-8BITMIME
         250-SIZE
         250-DSN
         250-ONEX
         250-ETRN
         250-XUSR
         250 HELP

SMTP>> MAIL FROM:<ukcrypto-admin@???> SIZE=3763

SMTP<< 250 <ukcrypto-admin@???>... Sender ok
SMTP>> RCPT TO:<USER@???>

SMTP<< 250 <USER@???>... Recipient ok
SMTP>> DATA

SMTP<< 354 Enter mail, end with "." on a line by itself
SMTP>> writing message and terminating "."

writing data block size=2743 timeout=300
ok=0 send_quit=0 send_rset=1 continue_more=0 yield=0 first_address=0
set_process_info: 28800 3.12 delivering 13kXCQ-0002Cv-00: just tried MX.EXAMPLE.COM [127.0.0.3] for USER@???: result OK
added retry item for T:MX.EXAMPLE.COM:127.0.0.3:13kXCQ-0002Cv-00: errno=110 77 flags=6
all IP addresses skipped or deferred at least one address
locked /var/spool/exim/db/wait-smtp.lockfile
opened DB file /var/spool/exim/db/wait-smtp: flags=42
Leaving smtp transport
set_process_info: 28800 3.12 delivering 13kXCQ-0002Cv-00 (just run smtp for USER@???)
post-process USER@???
LOG: 0 MAIN
== USER@??? T=smtp defer (110): Connection timed out: SMTP timeout while connected to MX.EXAMPLE.COM [127.0.0.3] after end of data (8229 bytes written)

Looking in tcpdump I see this:

(about 20 packets not shown, where the initial setup and TCP exchanges happen)
19:14:10.144906 195.224.76.132.2783 > 127.0.0.2.25: P 121:127(6) ack 354 win 32120 <nop,nop,timestamp 138650266 3028812662> (DF) (ttl 64, id 56405)
19:14:10.394868 127.0.0.2.25 > 195.224.76.132.2783: P 354:404(50) ack 127 win 32120 <nop,nop,timestamp 3028812687 138650266> (DF) (ttl 49, id 21446)
19:14:10.407112 195.224.76.132.2783 > 127.0.0.2.25: . ack 404 win 32120 <nop,nop,timestamp 138650293 3028812687> (DF) (ttl 64, id 56439)
19:14:10.410161 195.224.76.132.2783 > 127.0.0.2.25: P 127:1575(1448) ack 404 win 32120 <nop,nop,timestamp 138650293 3028812687> (DF) (ttl 64, id 56440)
19:14:10.412288 195.224.76.132.2783 > 127.0.0.2.25: P 1575:2870(1295) ack 404 win 32120 <nop,nop,timestamp 138650293 3028812687> (DF) (ttl 64, id 56441)
19:14:11.377070 195.224.76.132.2783 > 127.0.0.2.25: P 127:1575(1448) ack 404 win 32120 <nop,nop,timestamp 138650390 3028812687> (DF) (ttl 64, id 56504)
(this continues for some time like that)
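
One way to read a trace like this is to look for outbound data segments whose sequence range shows up more than once. Here is a minimal Python sketch of that idea, assuming the old tcpdump text format shown above. The name `find_retransmissions` and the regular expression are illustrative only and not part of any tool; the sample lines are the ones from the trace with their tails trimmed.

```python
import re
from collections import Counter

# Matches data-carrying lines in the old tcpdump text format, e.g.
# "19:14:10.410161 195.224.76.132.2783 > 127.0.0.2.25: P 127:1575(1448) ack 404 ..."
SEGMENT_RE = re.compile(
    r"^(?P<time>\S+) (?P<src>\S+) > (?P<dst>\S+): "
    r"(?P<flags>\S+) (?P<start>\d+):(?P<end>\d+)\((?P<length>\d+)\)"
)

def find_retransmissions(lines):
    """Return {(src, dst, start, end, length): count} for segments seen more than once."""
    counts = Counter()
    for line in lines:
        m = SEGMENT_RE.match(line.strip())
        if m is None:
            continue  # pure ACKs and commentary lines carry no sequence range
        key = (m["src"], m["dst"], int(m["start"]), int(m["end"]), int(m["length"]))
        counts[key] += 1
    return {key: n for key, n in counts.items() if n > 1}

# Lines copied from the trace above, truncated after the window field.
TRACE = """\
19:14:10.144906 195.224.76.132.2783 > 127.0.0.2.25: P 121:127(6) ack 354 win 32120
19:14:10.394868 127.0.0.2.25 > 195.224.76.132.2783: P 354:404(50) ack 127 win 32120
19:14:10.407112 195.224.76.132.2783 > 127.0.0.2.25: . ack 404 win 32120
19:14:10.410161 195.224.76.132.2783 > 127.0.0.2.25: P 127:1575(1448) ack 404 win 32120
19:14:10.412288 195.224.76.132.2783 > 127.0.0.2.25: P 1575:2870(1295) ack 404 win 32120
19:14:11.377070 195.224.76.132.2783 > 127.0.0.2.25: P 127:1575(1448) ack 404 win 32120
""".splitlines()

repeated = find_retransmissions(TRACE)
for (src, dst, start, end, length), n in repeated.items():
    print(f"{src} > {dst}: bytes {start}:{end} ({length} bytes) sent {n} times")
```

On the six lines shown, the only repeated segment is 127:1575, the 1448-byte one, which appears twice.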

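It looks to me like the receiving site has a problem with large
packets: the small segments are acknowledged, but the 1448-byte
segment is retransmitted and never acknowledged. All my packets have
DF set, so perhaps their firewall is blocking the outbound ICMP
"fragmentation needed" packets that path MTU discovery relies on.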
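
As a rough check on that guess, the on-the-wire size of each packet can be worked out from the payload lengths in the trace. This is only a sketch: it assumes a 20-byte IPv4 header with no IP options, a 20-byte base TCP header, and 12 bytes of TCP options (the `nop,nop,timestamp` seen in the trace). The helper name `ip_packet_size` is made up for illustration.

```python
IPV4_HEADER = 20   # bytes; assumes no IP options
TCP_HEADER = 20    # bytes; base TCP header
TCP_OPTIONS = 12   # nop + nop + 10-byte timestamp option, as in the trace

def ip_packet_size(payload):
    """Total IP packet size in bytes for a TCP segment with the given payload."""
    return IPV4_HEADER + TCP_HEADER + TCP_OPTIONS + payload

# Payload lengths taken from the tcpdump lines above.
for payload in (6, 50, 1295, 1448):
    print(f"payload {payload:4d} -> IP packet {ip_packet_size(payload):4d} bytes")
```

Under those assumptions the 1448-byte segment makes a 1500-byte IP packet, which is the usual Ethernet MTU. If any hop on the path has a smaller MTU, that packet cannot pass with DF set unless the ICMP "fragmentation needed" reply gets back to the sender.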

The reason I'm mailing this list is to ask whether this kind of thing
is at all common. I've not seen it on this scale before; perhaps
ukcrypto has a larger proportion of people with overly-fascist
firewalls or something.

If it's common, what do people do about it? I'm not sure I have the
energy to chase these people up, but surely they must have noticed
that many sites can't send them large mails?

If my guess is wrong, how would I tell, and what should I do?

Thanks,
Ian.