On Thu 05 Aug 1999, Philip Hazel wrote:
> On Thu, 5 Aug 1999, Paul Slootman wrote:
>
> > At least, 'find . -type f | xargs grep -il Content-Length' didn't show
> > anything...
> >
> > Could I request such a feature to be added? Then I won't have to see
> > those ugly ">from " lines anymore in my email.
>
> It's all very well asking for the feature, but what is its
> specification? Content-Length is not mentioned in any of the RFCs. Where
Well, RFCs discuss on-the-wire protocols and such, and not how messages
are stored. I don't think the maildir format, for example, is described
by an RFC either.
> is it specified? What precisely does it mean? Is it the body's length or
> does it include the headers? (If the latter, does it include itself -
> nasty self-referential problem there, so I guess not.) Are line endings
> counted as 1 or two characters? Is the blank line between headers and
> body counted or not? Who is supposed to add it and when? Is an MTA
> supposed to check an existing Content-Length header?
Hmm, in what order shall I answer all these questions.... :-)
I was used to having it back when I used smail. I thought exim also
supported it as I saw those lines in my mbox, but I discovered today
that procmail adds them if they're not already there. procmail also does
the from_hack thing if Content-Length isn't there, and I can't find a
way of turning the from_hack off in procmail.
It's the length of the body. It's used by MUAs to be able to read the
body in one go, as in read(mboxfd, bodybuf, content_length). This
removes the need to check every body line for the "\nFrom " sequence,
which is the only other way of finding where the next message starts.
Hence "line endings" need no special treatment; it just counts whatever
bytes are in the body.
The blank line between the headers and the body is not counted, nor is
the trailing blank line after the body.
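To make the MUA side concrete, here is a minimal sketch in Python of how a
reader could split a mailbox using Content-Length instead of scanning for
"\nFrom ". This is my own illustration, not code from any real MUA; the
function name split_mbox is made up, and it assumes every message carries a
correct Content-Length and that lines end in a bare "\n".

```python
def split_mbox(data):
    """Split Content-Length style mbox bytes into (headers, body) pairs.

    Assumption: every message has a correct Content-Length header giving
    the number of body bytes, not counting the blank line after the
    headers or the blank line that follows the body.
    """
    messages = []
    pos = 0
    while pos < len(data):
        # Skip the blank line(s) left between messages.
        if data.startswith(b"\n", pos):
            pos += 1
            continue
        # The headers end at the first empty line.
        end = data.find(b"\n\n", pos)
        if end == -1:
            raise ValueError("no blank line after headers")
        headers = data[pos:end + 1]
        length = None
        for line in headers.split(b"\n"):
            name, sep, value = line.partition(b":")
            if sep and name.strip().lower() == b"content-length":
                length = int(value.strip())
        if length is None:
            raise ValueError("message without Content-Length")
        # Take exactly `length` bytes; "From " lines inside are just data.
        body_start = end + 2
        body = data[body_start:body_start + length]
        messages.append((headers, body))
        pos = body_start + length
    return messages
```

Note that a body line starting with "From " causes no trouble here, but a
wrong Content-Length would make everything after it come out misaligned.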
> Messing with messages isn't really an MTA thing. Whatever the
Then the from_hack should be removed (*), as that is a perfect example of
messing with the (body of) messages! The Content-Length header is as
much "messing with messages" as adding a "Received" line (which is
acceptable behaviour for MTAs AFAIK).
(*) I don't seriously mean that, of course, it's just for argument's sake.
> specification is, it would be better if this was done by an MUA.
The MUA can't, as it can't determine for sure where the next message
begins (if from_hack is turned off).
BTW, here are a couple of relevant pieces from smail 3.2's docs:
From CHANGES:
: configurable header insertion/removal support can be used to
: support the Content-Length field, by adding the following
: generic attributes to the local transport:
:
: remove_header="Content-Length",
: append_header="Content-Length: $body_size"
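As an illustration of what those two attributes amount to at delivery time,
here is a rough Python sketch. The function name add_content_length is
hypothetical, and I'm assuming that $body_size means the number of bytes
after the blank line separating headers from body; I haven't checked how
smail actually computes it.

```python
def add_content_length(message):
    """Drop any existing Content-Length header and append a fresh one.

    Assumptions: `message` is bytes with "\n" line endings, the headers
    end at the first empty line, and Content-Length is never folded
    across several lines.
    """
    head, sep, body = message.partition(b"\n\n")
    lines = [
        line
        for line in head.rstrip(b"\n").split(b"\n")
        # Equivalent of remove_header="Content-Length"
        if not line.lower().startswith(b"content-length:")
    ]
    # Equivalent of append_header="Content-Length: $body_size"
    lines.append(b"Content-Length: " + str(len(body)).encode("ascii"))
    return b"\n".join(lines) + b"\n\n" + body
```

Removing the old header first matters, since a remote site may have supplied
a value that doesn't match the body as it is finally written to the mailbox.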
From INSTALL:
: SMAIL AND SYSTEM V RELEASE 4
:
: On many SVR4 implementations the mailbox file format defines a
: Content-Length field that indicates the length of each message, in
: bytes. This obviates the need for inserting ">" before lines beginning
: with "From " (and indeed, there are some problems with the AT&T-supplied
: version of mailx concerning message splitting, if you don't use the
: Content-Length header). Smail can be configured to generate
: Content-Length fields (and Content-Type fields). However, the
: compiled-in transports cannot do this. To configure generation of these
From samples/generic/transports:
: # IMPORTANT FOR SYSTEM V RELEASE 4 USERS
: #
: # The SVR4 mailx expects to find Content-Length header fields on
: # messages. If such a header is not found (or if a remote site
: # supplies an incorrect Content-Length header), then mailx may split
: # your mailbox file into messages at inappropriate boundaries. To
: # add a Content-Length field to messages appended to your mailbox
: # files, and sent to shell-command or file addresses, uncomment all
: # attributes that are indicated with "SVR4 mailbox format". This
: # will also ensure that you have a "Content-Type" field, defaulting
: # the content type to "text".
: #
: # You will likely also wish to uncomment unix_from_hack from the
: # local, pipe, and file transports, since prepending > to lines
: # starting with From is not necessary with the SVR4 mailbox
: # format. You can also comment out the suffix="\n" lines in the
: # local, and file transports, since a blank line is not required
: # between messages for the SVR4 mailbox format.
Here's something I found while dredging the net for content-length.
I've included it to show I can relate to your POV :-) It's from
http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html
which goes on in more depth. In fact, it's almost convinced me :-)
: In the BSD format, the only safe way to add a message to a file is to
: mangle occurrences of the ``From '' delimiter in the body of messages to
: some other string, usually ``>From ''. This is mangling, not quoting,
: because it's not a reversible process (since ``>From '' is not also
: quoted.)
:
: Now, there are actually two very similar-looking file formats. One is
: the BSD format, which I've described. The other, which one might as well
: call the ``content-length'' format, is used by some SYSV-derived
: systems, notably Solaris. It's very similar, but subtly incompatible.
: This format does not quote ``From '' lines, but instead relies on a
: Content-Length header in the message proper to indicate the exact
: byte-position of the end of each message.
:
: This latter format is non-portable, easily-corruptible, and overall,
: brain-damaged (that's a technical term.) But I'll refrain from ranting
: about it again right now...
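The "mangling, not quoting" point is easy to demonstrate. Below is a small
Python sketch of my own (the names from_hack and naive_unhack are made up,
and this is not how any particular MTA implements it): two different bodies
end up identical in the mailbox, so no reader can tell which one was sent.

```python
def from_hack(body):
    """Prefix '>' to every body line that starts with 'From '."""
    return "\n".join(
        ">" + line if line.startswith("From ") else line
        for line in body.split("\n")
    )


def naive_unhack(body):
    """Try to undo from_hack by stripping '>' from '>From ' lines."""
    return "\n".join(
        line[1:] if line.startswith(">From ") else line
        for line in body.split("\n")
    )


plain = "From here on it gets tricky\n"
quoted = ">From here on it gets tricky\n"

# Both bodies are stored as the same text in the mailbox...
stored_plain = from_hack(plain)
stored_quoted = from_hack(quoted)

# ...so undoing the hack restores one of them and corrupts the other.
restored_plain = naive_unhack(stored_plain)
restored_quoted = naive_unhack(stored_quoted)
```

Here stored_plain and stored_quoted are equal, so restored_quoted comes back
without the ">" the sender actually wrote.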
Anyway, thanks for your response. I'll think about it a bit more...
Paul Slootman
--
Better, faster, | home: paul@??? http://www.wurtel.demon.nl/
cheaper: | debian: paul@??? isdn4linux: paul@???
choose any two. | work: paul@??? Murphy Software, Enschede, NL