Re: [Exim] bare linefeeds in SMTP

Author: Kjetil Torgrim Homme
Date:  
To: exim-users
Subject: Re: [Exim] bare linefeeds in SMTP
On Fri, 2003-12-19 at 10:52, Philip Hazel wrote:
> On Thu, 18 Dec 2003, Kjetil Torgrim Homme wrote:
> > since Exim's queue format uses LF only (not wire format -- IMHO this
> > should be fixed for the next major release),
>
> Why? What possible advantage is there to changing the internal way in
> which Exim stores messages? I do not accept that this is a problem.
> Therefore, any proposal to alter it should be to "change" rather than to
> "fix".


sorry, didn't mean to rub you the wrong way.

> Whether Exim identifies lines by CR, CRLF, or LF terminators, or keeps
> them as count-plus-data, or uses any other valid format, should make not
> the slightest difference to its behaviour as seen from outside.


as long as Exim is able to reconstruct the original data, everything is
fine. but the current format has problems reconstructing bare CR, LF or
NUL. this is needed if Exim is to support RFC 3030, the binary data
extension for SMTP. (of course, Exim doesn't support 8BITMIME either,
which is very much related, so perhaps no one will ever feel the itch
needed to get hacking.)
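
to illustrate the reconstruction problem, here is a minimal sketch in Python. it is a simplified model of "store with LF only, add CR on the way out", not Exim's actual spool code, and the function names are made up for the example.

```python
def to_spool(wire: bytes) -> bytes:
    """Hypothetical LF-only store: every CRLF pair becomes a single LF."""
    return wire.replace(b"\r\n", b"\n")


def to_wire(spool: bytes) -> bytes:
    """Hypothetical reverse step: every LF becomes CRLF again."""
    return spool.replace(b"\n", b"\r\n")


# A message whose second line ends in a bare LF (no CR in front of it).
original = b"line one\r\nbare\nlinefeed\r\n"
rebuilt = to_wire(to_spool(original))

# The bare LF and the CRLF both collapsed to LF in the spool, so the
# rebuilt message cannot tell them apart any more.
print(rebuilt == original)
```

a message that only ever contains CRLF survives the round trip in this model; one with a bare LF does not, because both forms are stored as the same byte.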

> I can see no advantage whatsoever to making this change, and several
> *dis*advantages:
>
> (i) Work expended for no visible gain.
> (ii) All interfaces for receiving and delivering have to be changed. No
> doubt some bugs will get introduced. (Work expended for *negative* gain!)
> (iii) A nasty upgrade path for existing users with existing messages on
> their spools - or a very long overlap period which is bad news for the
> code maintainer.


good points, and this is why it must happen in a major release. it
isn't very hard to whittle down the old mail queue using the old binary,
or to run something akin to unix2dos on it, but yes, more than one
system administrator will probably mess that up unless the release notes
are _crystal_ clear.

the same transition was done in INN (the NNTP server). sure it was
painful, since it changed the spool format, not just the queue, but it
was worth it. with the data in wire format (that is, even including the
CR LF . CR LF), resubmitting it is very fast and efficient. the CPU
doesn't have to touch the data at all, you can even use sendfile(2).
this is much more of an advantage for a news server which actually has
to send the same data many times. but ideally, the receiving code for
Exim will also have to do less data munging since the data _should_ be
in wire format already; only adding a Received header should be needed
by default.
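
as a rough sketch of the per-byte work that goes away, this is what a sender has to do when the stored body is LF-terminated: walk every line, apply SMTP dot-stuffing, append CR LF, and finish with the end-of-data marker. the function name and the assumption that the body is held in memory as bytes are mine, for illustration only.

```python
def lf_spool_to_wire(body: bytes) -> bytes:
    """Convert an LF-terminated body to SMTP DATA wire format (sketch)."""
    lines = body.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()              # body ended with LF; drop the empty tail

    out = bytearray()
    for line in lines:
        if line.startswith(b"."):
            out += b"."          # dot-stuffing: double a leading dot
        out += line + b"\r\n"    # wire line ending
    out += b".\r\n"              # end-of-data marker
    return bytes(out)


print(lf_spool_to_wire(b"hello\n.hidden\n"))
```

if the file on disk already holds exactly these bytes, the sender can hand it to the kernel unchanged, which is what makes sendfile(2) usable.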

another advantage is that the invariant is stronger. by making the
internal representation wire format, the exceptions to the format will
stick out as exceptions.

> Because wire format uses CRLF and Unix files use LF, translations are
> going to be necessary whatever you choose. For servers that are
> primarily relays, going into and out of LF format could be considered a
> waste, but for servers that are doing a lot of local->local delivery,
> the same could be said of going into and out of CRLF format if that were
> used. So as I said, I see no case at all for making any change.


my local format _is_ CR LF, since I use Cyrus as the backing store. and
every message makes at least three SMTP hops in our setup. I guess it
is not a typical setup, though.

> I am not sure what happens with Telnet connections to port 25, but this
> has long been a tradition for testing SMTP servers. If it is the case
> that only LF is sent, the server must accept it. Just think of the
> outcry I would get if I "broke" this usage.


UNIX telnet translates LF to CR LF on the client side, at least.

> message_prefix specifies *exactly* what is put on the front of the
> message. Nothing is added; no fixups are applied. [...]


thank you for the explanation.

> Possibly. I need to think about this. "If first line of DATA is
> terminated by CRLF, then fixup bare LF in header lines" might indeed be
> the best heuristic.


I'll have a look at it.
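
for discussion, a minimal sketch of how I read that heuristic, in Python rather than Exim's C. the function name is hypothetical, and treating the first empty line as the end of the header is my assumption about where the fixup should stop.

```python
import re


def fixup_header_bare_lf(data: bytes) -> bytes:
    """If the first line ends in CRLF, turn bare LF in the header into CRLF."""
    first_lf = data.find(b"\n")
    if first_lf <= 0 or data[first_lf - 1:first_lf] != b"\r":
        return data              # first line is not CRLF-terminated: no fixup

    # The header ends at the first empty line (assumption, see above).
    blank = re.search(rb"\r?\n\r?\n", data)
    end = blank.end() if blank else len(data)

    # Replace only LFs that are not already preceded by CR.
    header = re.sub(rb"(?<!\r)\n", b"\r\n", data[:end])
    return header + data[end:]   # the body is left untouched


msg = b"From: a\r\nSubject: x\nTo: b\r\n\r\nbody\nline\r\n"
print(fixup_header_bare_lf(msg))
```

a message whose first line already ends in a bare LF is passed through unchanged, so clients that send LF only are not affected.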

--
Kjetil T.