Sheldon Hearn wrote:
> Hang on, you're misunderstanding the results you see. You're looking at
> the message "in transit", where it's governed by network transport
> rules. Those extra CRs you're seeing are removed at the other end of
> the transport pipe, as mandated by the standards.
I think this section from RFC 2821 states fairly clearly that Exim's current
behavior of potentially transmitting CRCRLF, or a bare CR, is wrong (unless
a service extension says otherwise).
----------------
2.3.7 Lines
SMTP commands and, unless altered by a service extension, message
data, are transmitted in "lines". Lines consist of zero or more data
characters terminated by the sequence ASCII character "CR" (hex value
0D) followed immediately by ASCII character "LF" (hex value 0A).
This termination sequence is denoted as <CRLF> in this document.
Conforming implementations MUST NOT recognize or generate any other
character or character sequence as a line terminator. Limits MAY be
imposed on line lengths by servers (see section 4.5.3).
In addition, the appearance of "bare" "CR" or "LF" characters in text
(i.e., either without the other) has a long history of causing
problems in mail implementations and applications that use the mail
system as a tool. SMTP client implementations MUST NOT transmit
these characters except when they are intended as line terminators
and then MUST, as indicated above, transmit them only as a <CRLF>
sequence.
------------------
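To make the rule concrete, here is a minimal sketch (Python, purely illustrative; the function name is my own and this is not Exim code) that scans an outgoing byte stream for any CR or LF that is not part of a <CRLF> pair, which is what the quoted paragraph forbids a client to transmit:

```python
import re

# A CR not followed by LF, or an LF not preceded by CR.
BARE_CR_OR_LF = re.compile(rb"\r(?!\n)|(?<!\r)\n")

def find_bare_cr_lf(data: bytes) -> list:
    """Return (offset, name) for every CR or LF outside a CRLF pair."""
    return [(m.start(), "CR" if m.group() == b"\r" else "LF")
            for m in BARE_CR_OR_LF.finditer(data)]

print(find_bare_cr_lf(b"line one\r\nline two\r\n"))   # []
print(find_bare_cr_lf(b"line one\r\r\nline two\n"))   # [(8, 'CR'), (19, 'LF')]
```

Note that in CRCRLF only the first CR is flagged: the second one belongs to a valid <CRLF>.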
Cyrus would then also be wrong in recognizing CRCRLF as two line terminators
- but this isn't a Cyrus mailing list, and I don't think that lets Exim off
the hook for potentially sending something it's not supposed to.
> Those extra CRs you're seeing are removed at the other end of
> the transport pipe, as mandated by the standards.
I'm not sure where the standards specify that. Dropping every CR is probably
just the most straightforward way of converting data from the SMTP realm to
the Unix realm, and I'm sure a lot of programs work that way. However, I don't
think it's something you can count on, and I don't think you can blindly add
as many CRs as you like and expect that someone else will clean them up. And
what if the other end does want to keep CRLF (a non-Unix box, Cyrus, etc.)?
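As a rough illustration of why I don't think you can count on it, here are three hypothetical receiver strategies (the function names are mine, and none of them is claimed to be what any particular program actually does) applied to the same CRCRLF input. Each one produces a different result:

```python
def drop_all_cr(wire: bytes) -> bytes:
    """Receiver A: strip every CR, leaving LF as the only line ending."""
    return wire.replace(b"\r", b"")

def crlf_only(wire: bytes) -> bytes:
    """Receiver B: convert CRLF to LF, leave any other CR untouched."""
    return wire.replace(b"\r\n", b"\n")

def any_cr_or_lf(wire: bytes) -> bytes:
    """Receiver C: treat CRLF, bare CR and bare LF each as a line ending."""
    return wire.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

wire = b"first\r\r\nsecond\r\n"

print(drop_all_cr(wire))    # b'first\nsecond\n'     extra CR silently gone
print(crlf_only(wire))      # b'first\r\nsecond\n'   stray CR kept in the text
print(any_cr_or_lf(wire))   # b'first\n\nsecond\n'   an extra blank line appears
```

Only receiver A hides the extra CR. Receiver C is the "two line terminators" reading mentioned above.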
> What we're talking about, with respect to changing message bodies, is
> that Exim shouldn't make changes that are visible OUTSIDE the scope of
> transport. This is also mandated (to varying degrees depending on
> context) by the standards.
Yeah it'd be nice to expect that what went in one end came out exactly
byte-for-byte the same in the other. But can anyone in their right mind
really count on that - given that you may not be sending to another Unix box,
or may be passing through who-knows-how-many virus scanners or other gateways?
I think the reality is that line terminators are always going to be changing
to suit the needs of whatever agent is holding the message at the moment. If
you want to transfer any exact sequence of bytes, you'd have to base64 or
uuencode it to have any hope of it passing through unchanged.
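A small sketch of that point, assuming a hypothetical intermediate agent that rewrites every line ending to LF (the `rewrite_line_endings` helper is invented for the example): the raw bytes are altered, while the base64-encoded copy decodes back to the original, because the decoder ignores line breaks in the encoded text.

```python
import base64

# Payload containing a bare CR, a CRCRLF and a bare LF.
payload = b"col1\rcol2\r\r\nend\n"

def rewrite_line_endings(data: bytes) -> bytes:
    """Stand-in for a gateway that turns CRLF and bare CR into LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

# Sent as-is: the agent changes the bytes.
raw_received = rewrite_line_endings(payload)

# Sent base64-encoded, with CRLF line endings on the wire.
encoded = base64.encodebytes(payload).replace(b"\n", b"\r\n")
decoded = base64.decodebytes(rewrite_line_endings(encoded))

print(raw_received == payload)   # False
print(decoded == payload)        # True
```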
What I'm talking about is just making sure Exim follows RFC 2821 and only
transmits CRLF over the wire, and not CRCRLF or just CR or LF or any other
weird combination. If this can be guaranteed, then everything else should
work itself out, or at least we can really say it's not an Exim problem.
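For what it's worth, one possible sender-side policy can be sketched in a few lines. This is an assumption about how it could be done, not a description of Exim's transport code, and `to_wire` is a made-up name. It treats CRLF, bare CR and bare LF each as one line ending and emits CRLF for all of them, so a CRCRLF in the input comes out as two line breaks:

```python
import re

# CRLF is listed first so a pair is consumed as one unit;
# any CR or LF left over is then matched on its own.
ANY_LINE_END = re.compile(rb"\r\n|\r|\n")

def to_wire(data: bytes) -> bytes:
    """Rewrite every line ending so that only CRLF is transmitted."""
    return ANY_LINE_END.sub(b"\r\n", data)

print(to_wire(b"a\nb\r\nc\rd\r\r\ne"))   # b'a\r\nb\r\nc\r\nd\r\n\r\ne'
```

Whether a bare CR should become a line break or be dropped is a policy choice; the only point here is that nothing other than CRLF leaves the box.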
Barry