Re: [Exim] Problems with CR translation in message bodies

Author: Philip Hazel
Date:
To: Ari Gordon-Schlosberg
CC: exim-users
Subject: Re: [Exim] Problems with CR translation in message bodies

On Fri, 26 Sep 2003, Ari Gordon-Schlosberg wrote:

> Ok, after another round of testing, I've come up with some interesting
> results. It seems that the problem is a little more complicated. SMTP
> injection directly into Exim appears to work just fine, as does piping to
> sendmail.


By "piping to sendmail" I assume you really mean "piping to exim"?

> Now in the SourceForge cluster, we have some hosts running Exim 3.36.


That will make things different. The handling of CRs has changed since
3.36. In 3.36, CRs in the data are treated as data characters and just
passed on.

That explains what you see completely. I hadn't thought of this effect
when making the change to Exim 4.

> So I'm not sure who is in the wrong here, and it may very well be Exim 3.36
> and Postfix. However, this appears to be some sort of unintended
> consequence.


It is, but it's all because people will not be consistent about their
line ending usage.

My own view is that you should not expect any program on a Unix system
to behave in any one particular way if you feed it data with CRLF line
terminators, with the sole exception of a program that is reading SMTP
input. For SMTP, CRLF is defined as the line terminator. The current
SMTP RFC also says that "bare" CRs cause trouble. You can certainly
expect odd effects from those.

> I can furnish more data, if need be, including raw packet captures.


No need. I know exactly what is happening. Postfix and Exim 3.36 are
treating CR just like any other data character. After all, lines in Unix
files are terminated with LF, right? So the CRs get sent over the wires.
Each line on the wire will then end with CRCRLF (a CR data character,
and an SMTP line terminator).
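
A minimal sketch of that effect, in Python. This is not code from Postfix
or Exim; the function name to_wire_cr_as_data is hypothetical, and it only
models the assumption that locally submitted lines are split on LF and each
one is then sent with an SMTP CRLF appended.

```python
def to_wire_cr_as_data(local_message: bytes) -> bytes:
    """Model a sender that treats LF as the only local line terminator.

    Any CR in the input is kept as an ordinary data character, and every
    line is written to the wire followed by the SMTP terminator CRLF.
    """
    lines = local_message.split(b"\n")
    # A message ending in LF yields a trailing empty element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return b"".join(line + b"\r\n" for line in lines)


# A message that was handed over with CRLF line endings on a Unix host.
wire = to_wire_cr_as_data(b"Hello\r\nWorld\r\n")
print(wire)
```

With CRLF input, each line on the simulated wire ends in CR CR LF; with
LF-only input, each line ends in a plain CRLF.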

However, Exim 4 was changed (after much discussion) to treat a "bare" CR
as a line terminator. (Apparently in MacOS CR is/was the line terminator
- is this right?) Hence it turns CRCRLF into two line terminators.
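
To illustrate, here is a rough Python sketch of a reader that accepts
either CRLF or a "bare" CR as a line terminator. It is an assumption-laden
model, not Exim's actual input code; the function name
split_lines_bare_cr_terminates is made up for this example.

```python
import re
from typing import List


def split_lines_bare_cr_terminates(wire: bytes) -> List[bytes]:
    """Split wire data into lines, ending a line at CRLF or at a bare CR.

    CRLF is listed first in the pattern so that a CR directly followed by
    LF is consumed as a single terminator, not as a bare CR.
    """
    parts = re.split(rb"\r\n|\r", wire)
    # Data ending in a terminator yields a trailing empty element; drop it.
    if parts and parts[-1] == b"":
        parts.pop()
    return parts


# Each line arrives ending in CR CR LF: a CR data character plus CRLF.
received = split_lines_bare_cr_terminates(b"Hello\r\r\nWorld\r\r\n")
print(received)
```

In this model every CR CR LF is read as two terminators, so an empty line
appears after each original line and the body looks double-spaced.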

[Come back OS/370, where lines are defined by length. All is forgiven.]

> I think I'm going to advise our folks to strip out the CRs in the code, as
> it's actually not the correct behavior for our platform (Linux). We should
> just be passing LF as the newline, rather than CRLF.


YES, YES!!
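
The fix on the submitting side can be sketched in a few lines of Python.
This is only an illustration of "strip out the CRs"; the function name
strip_crs is hypothetical, and it assumes the message is held as bytes
before it is piped to the local MTA.

```python
def strip_crs(message: bytes) -> bytes:
    """Remove every CR so that lines end with a bare LF, as is usual on Unix."""
    return message.replace(b"\r", b"")


body = strip_crs(b"Hello\r\nWorld\r\n")
print(body)
```

The MTA then adds the SMTP CRLF terminators itself when the message goes
out over the wire.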

--
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.
Get the Exim 4 book:    http://www.uit.co.uk/exim-book