Re: [Exim] bare linefeeds in SMTP

Author: Philip Hazel
Date:  
To: Kjetil Torgrim Homme
CC: exim-users
Subject: Re: [Exim] bare linefeeds in SMTP
On Wed, 17 Dec 2003, Kjetil Torgrim Homme wrote:

> we're seeing a lot of corrupted spam messages today, due to Exim's
> handling of bare LF in an SMTP dialog. evidently, Exim will accept it
> as a proper line ending, which isn't satisfactory behaviour.


Sigh. *Sigh*. SIGH. S I G H! Sorry, I'm going to have to rant just a
little; I can't keep it all bottled up.

<rant>
This aspect of Exim keeps cropping up time and time again. I wish we
could kill it once and for all. I originally wrote Exim to be strict.
Input from local sources used LF as a terminator; input using SMTP over
TCP/IP used CRLF.

Likewise, output using SMTP over TCP/IP used CRLF; delivery to files
and pipes used LF because that's the Unix standard. (Internally, Exim
stores messages using LF as the line terminator.)

It wasn't long before I was asked to make it accept LF as well as CRLF
in SMTP over TCP/IP because some broken clients do this and "we have to
interwork with these clients because our boss says so"... So I
liberalized it.

Then somebody wanted delivery to files and pipes with CRLF terminators,
so I added an option for that.

Then the problem of bare CRs cropped up. Exim originally treated them as
just another data character. The first attempt to do something about
this was the -dropcr option, to just drop them.

However, people complained because messages coming from all sorts of
different sources (including systems where bare CR is a line terminator)
gave them line terminator problems. If you search the archives you will
find some of these discussions.

The upshot of all this was a change at release 4.21, which was recorded
thus:

58. Following a discussion on the list, the rules by which Exim recognises line
    endings on incoming messages have been changed. The -dropcr and drop_cr
    options are now no-ops, retained only for backwards compatibility. The
    following line terminators are recognized: LF CRLF CR. However, special
    processing applies to CR:


    (i)  The sequence CR . CR does *not* terminate an incoming SMTP message,
         nor a local message in the state where . is a terminator.


    (ii) If a bare CR is encountered in a header line, an extra space is added
         after the line terminator so as not to end the header. The reasoning
         behind this is that bare CRs in header lines are most likely either
         to be mistakes, or people trying to play silly games.
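
As an illustration only, the rules recorded in that change entry can be
sketched in a few lines of Python. This is not Exim's code (Exim is
written in C), and the function names are invented for the example. The
entry only says that the exact sequence CR . CR does not terminate a
message; treating every other lone "." line as a terminator is an
assumption made to keep the sketch short.

```python
def split_lines(data: bytes):
    """Split data into (line, terminator) pairs.

    Recognised terminators are CRLF, LF and bare CR; a trailing
    unterminated fragment is returned with an empty terminator name.
    """
    out = []
    start = 0
    i = 0
    n = len(data)
    while i < n:
        c = data[i]
        if c == 0x0D:  # CR: either the first half of CRLF or a bare CR
            if i + 1 < n and data[i + 1] == 0x0A:
                out.append((data[start:i], "CRLF"))
                i += 2
            else:
                out.append((data[start:i], "CR"))
                i += 1
            start = i
        elif c == 0x0A:  # LF not preceded by CR
            out.append((data[start:i], "LF"))
            i += 1
            start = i
        else:
            i += 1
    if start < n:
        out.append((data[start:], ""))
    return out


def normalise_headers(data: bytes) -> bytes:
    """Rewrite header text with LF terminators (rule ii).

    After a bare CR an extra space is emitted, so the next line becomes
    a continuation of the same header instead of starting a new one.
    """
    out = bytearray()
    for line, term in split_lines(data):
        out += line
        if term:
            out += b"\n"
        if term == "CR":
            out += b" "
    return bytes(out)


def is_end_of_data(prev_term: str, line: bytes, term: str) -> bool:
    """Rule (i): a lone '.' ends the message, except as CR . CR.

    Assumption: mixed cases such as CR . LF are treated as terminating;
    the change entry does not say what happens there.
    """
    return line == b"." and not (prev_term == "CR" and term == "CR")
```

For example, normalise_headers(b"Subject: x\rmore\r\nTo: y\n") gives
b"Subject: x\n more\nTo: y\n": the bare CR no longer ends the Subject
header.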


I suppose I should have learned from past experience that one size does
not fit everybody, and perhaps I should have implemented a whole slew of
options, but I did not fancy the task of designing and implementing
things like "allow bare LF from hosts x,y,z, and CRLF from a,b,c" and so
on.

SIGH once more....
</rant>

> 1) what to do with a bare LF.
>
> one alternative is to change it into CRLF SPC, but I think it's better
> to flatly reject the message with a reference to RFC 2822 section 2.2.


I suppose I could make it treat bare LF in a header like bare CR (see ii
above), but only if this was SMTP over TCP/IP, and there had been a
previous CRLF line terminator. Yuck.

> if you really, really want to support telnet sessions with no LF->CRLF
> translation, you can make a note of what style of line ending was used
> on the very first command (HELO or whatever).


There have been bad experiences in the past with software that has tried
to be clever like this. It tends to get confused when (by some kind of
mistake) the first line is terminated differently to the subsequent
lines. We saw this when the option to deliver messages locally with CRLF
was introduced. If you set that option, but use a message_prefix with
only LF, and the program you deliver to over a pipe tries to be clever
like this, you have a problem.
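
A small sketch of that failure, assuming a hypothetical consumer on the
far end of the pipe that picks its line terminator from the first line
only. The reader function and the sample message are invented for
illustration; only the combination of an LF-only message_prefix with
CRLF delivery comes from the case described above.

```python
def sniffing_reader(data: bytes) -> list[bytes]:
    """Hypothetical consumer: choose the terminator from the first line,
    then split the whole input on that terminator alone."""
    first_lf = data.find(b"\n")
    if first_lf == -1:
        return [data]
    # CRLF only if the first LF is immediately preceded by a CR
    if first_lf > 0 and data[first_lf - 1:first_lf] == b"\r":
        term = b"\r\n"
    else:
        term = b"\n"
    lines = data.split(term)
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final terminator
    return lines


# Prefix written with LF only, message body delivered with CRLF.
message = b"From someone\n" + b"Subject: hi\r\n\r\nbody\r\n"
lines = sniffing_reader(message)
# lines == [b"From someone", b"Subject: hi\r", b"\r", b"body\r"]
```

Because the reader locked on to LF, every later line keeps a stray CR,
and the blank line that should separate the header from the body
arrives as b"\r" rather than as an empty line.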

> 2) what to do with a "header line" without a colon.
>
> again, I favour rejecting the message. let's "force" people to follow
> the standards. the good guys generally do, it's the quick "let's write
> a visual basic program to spam millions of people" who get it wrong.


I used to think more like this when I first wrote Exim. I've since
become more liberal, mainly as a result of interacting with people who
are operating in a commercial environment.

On Thu, 18 Dec 2003, Kjetil Torgrim Homme wrote:

> being a University, we can afford to stick to standards. yes, this did
> once mean that e-mail confirmations from a travel agent who had bought
> a defective e-mail service didn't get through, but I hold that this is
> the travel agent's problem. if thousands of our employees can't use
> their service, that should be an incentive to get their system fixed.


You can just about hold that line at a university. I doubt very much
that a business would be able to.

> the current handling of LF must be fixed. it's not pretty to receive
> messages with no real headers, only the automatically generated ones.


I do not disagree with that.

> I doubt that is possible, the mangling of CRLF into LF seems to happen
> before the ACLs have a chance to take a peek:


Yes. The line terminators are sorted out as the message is received.

--
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.