On Thu Jan 27 2005 at 15:30:34 CET, Tony Finch wrote:
> The problem occurs earlier than that, because the sender never sees the
> response to CRLF.CRLF and aborts at that point, but the recipient thinks
> the sender received it and said QUIT despite the sender thinking
> otherwise. Definitely firewall protocol fux-up.
It doesn't appear to be a protocol "fux-up", as you call it ;-) We've traced
both the outside (i.e. in this case the sending side) and the inside (i.e. the
receiving Exim). The Checkpoint FW1 is not "fixing" anything. The packet flow is:
gatem (outside) ====> m1 (inside)
-----------------------------------------------------------
HELO
SMTP body
...
SMTP Message body [2] > SMTP Message
TCP ACK < TCP ACK
TCP ACK (for previous packets)
TCP ACK ""
TCP ACK ""
TCP ACK ""
TCP FIN,ACK < TCP FIN,ACK
TCP FIN > TCP FIN,ACK
< TCP ACK
TCP ACK
The last body packet sent towards the inside actually does have the "\r\n.\r\n"
in it, and m1 on the inside reads it correctly. We don't see the 250 response
returning (neither arriving on "outside" nor leaving "inside"). This leads me to
believe that the Exim process on "inside" has simply died. Unfortunately, when
I try to re-run delivery of the same message-id by running the "inside" Exim
with `-d+all', the message is processed correctly, so I can't prove that Exim
originally died.
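For anyone who wants to repeat the check on their own capture, here is a minimal Python sketch. It assumes the TCP payloads of the DATA phase have already been extracted from the trace as byte strings, in order. The names `ends_with_terminator` and `reply_code` are made up for illustration and are not part of Exim or of any capture tool. The payloads are joined before checking, since the terminator may be split across two packets.

```python
# Hypothetical helpers for inspecting an SMTP capture by hand.
END_OF_DATA = b"\r\n.\r\n"  # SMTP end-of-data marker: CRLF . CRLF


def ends_with_terminator(payloads):
    """Return True if the concatenated payloads end with CRLF.CRLF.

    `payloads` is assumed to be a list of bytes objects, one per TCP
    segment, in the order they were sent.
    """
    stream = b"".join(payloads)
    return stream.endswith(END_OF_DATA)


def reply_code(line):
    """Return the three-digit SMTP reply code of a reply line, or None."""
    code = line[:3]
    if len(code) == 3 and code.isdigit():
        return int(code)
    return None
```

If `ends_with_terminator` is true for the sender's segments and no reply line with code 250 ever shows up in the other direction, the capture matches what is described above.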
What leads me to believe that Exim really has died is that I see the corresponding
files for the message-id in the spool/input directory. They are subsequently
delivered by a normal queue runner process, resulting in multiple deliveries.
Meanwhile, the original message is still queued for delivery on "outside", and
is sometimes delivered correctly after one, two or more hours; again, this leads me
to believe that Exim died.
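The spool check can be scripted roughly as follows. This is only a sketch: it assumes a flat spool/input directory (no split spool) and relies on Exim's convention of storing each message as a `<message-id>-H` header file and a `<message-id>-D` data file. The function names are hypothetical.

```python
import os
from collections import defaultdict


def spooled_parts(input_dir):
    """Map each message-id to the set of spool suffixes ('H', 'D') found."""
    found = defaultdict(set)
    for name in os.listdir(input_dir):
        # Split "<message-id>-H" or "<message-id>-D" at the last hyphen.
        base, sep, suffix = name.rpartition("-")
        if sep and base and suffix in ("H", "D"):
            found[base].add(suffix)
    return dict(found)


def complete_messages(input_dir):
    """Return the sorted message-ids that have both a -H and a -D file."""
    parts = spooled_parts(input_dir)
    return sorted(mid for mid, sfx in parts.items() if sfx == {"H", "D"})
```

A message-id reported by `complete_messages` that the sender never got a reply for is one a later queue runner can pick up and deliver again.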
It appears, then, that it is a *content* problem. I have exiscan-acl configured, but
disabling it entirely doesn't solve the problem.
Any more ideas, gentlemen?
-JP