[Exim] Interpreting Subject: lines

Author: Philip Hazel
Date:
To: exim-users
Subject: [Exim] Interpreting Subject: lines - opinions, please

The current state:

The $h_xxx: expansion in Exim contains the actual content of the line,
with leading/trailing white space removed, and commas inserted between
multiple lines when the header is one that contains addresses. The
$rh_xxx: expansion is the same, except that leading/trailing white
space is not removed, and no commas are inserted.
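
As a rough illustration of the difference between the two expansions, here
is a toy model in Python. It is not Exim's code. The function names, the set
of address headers, and the way multiple lines are joined (newline between
values, plain concatenation for the raw form) are all assumptions made for
the sketch.

```python
# Toy model of the $h_xxx: and $rh_xxx: expansions described above.
# ASSUMPTIONS: the header set and the join separators are illustrative only.
ADDRESS_HEADERS = {"from", "sender", "reply-to", "to", "cc", "bcc"}


def raw_values(header_lines, name):
    """Return the text after 'Name:' for every matching header line."""
    prefix = name.lower() + ":"
    return [line[len(prefix):] for line in header_lines
            if line.lower().startswith(prefix)]


def rh_expand(header_lines, name):
    """Raw form: white space kept, no commas inserted."""
    return "".join(raw_values(header_lines, name))


def h_expand(header_lines, name):
    """Cooked form: white space trimmed, commas between address lines."""
    values = [value.strip() for value in raw_values(header_lines, name)]
    separator = ",\n" if name.lower() in ADDRESS_HEADERS else "\n"
    return separator.join(values)


headers = [
    "To: alice@example.com \n",
    "To:  bob@example.com\n",
    "Subject:  Hello \n",
]
print(repr(h_expand(headers, "to")))
print(repr(rh_expand(headers, "subject")))
```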

The cause of the problem:

Header lines can be encoded according to RFC 2047. The most common uses
of this are to get non-ASCII characters into real names as part of
addresses, and also to get such characters into Subject: lines.

The problem:

Exim does not interpret header lines that are encoded. Thus, for
example, if a message contains the line

    Subject: Internet café

it might actually appear in the message as

    Subject: =?iso-8859-1?Q?Internet_caf=E9?=

This means that a user's filter file with a test such as

    if $h_subject: contains "Internet café"

won't match, because $h_subject: still holds the encoded form. Note that
a string without any special characters could also be encoded this way,
just to confuse things.
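
To make the mismatch concrete, the following Python sketch compares the
filter's search string with a few forms the same Subject: text can take on
the wire. The base64 variant and the ASCII-only variant are examples I have
added for illustration; only the first form appears in the message above.

```python
# What the user types in the filter test.
wanted = "Internet café"

# Possible wire forms of the Subject: text (the last two are illustrative).
wire_forms = [
    "=?iso-8859-1?Q?Internet_caf=E9?=",        # quoted-printable style
    "=?iso-8859-1?B?SW50ZXJuZXQgY2Fm6Q==?=",   # base64 style, same text
]

# A plain substring test against the undecoded text never succeeds.
raw_matches = [wanted in form for form in wire_forms]

# Even a purely ASCII subject can be sent as an encoded-word.
ascii_wanted = "Internet cafe"
ascii_wire = "=?iso-8859-1?Q?Internet_cafe?="
ascii_match = ascii_wanted in ascii_wire

print(raw_matches, ascii_match)
```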

The proposed solution:

An Exim user has written a patch which causes the contents of header
lines to be decoded according to the RFC 2047 rules before being put
into a $h_xxx variable.
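
The effect of decoding before the test can be sketched in Python using the
standard library's email.header module. This only illustrates the idea; it
is not the patch itself, and the helper names decode_rfc2047 and
subject_contains are hypothetical.

```python
from email.header import decode_header, make_header


def decode_rfc2047(value):
    """Decode any RFC 2047 encoded-words in a header value to text."""
    return str(make_header(decode_header(value)))


def subject_contains(raw_subject, needle):
    """Stand-in for 'if $h_subject: contains ...' on the decoded text."""
    return needle in decode_rfc2047(raw_subject)


raw_q = "=?iso-8859-1?Q?Internet_caf=E9?="
raw_b = "=?iso-8859-1?B?SW50ZXJuZXQgY2Fm6Q==?="   # same text, base64 form

decoded_q = decode_rfc2047(raw_q)
decoded_b = decode_rfc2047(raw_b)
print(decoded_q, decoded_b)
print(subject_contains(raw_q, "Internet café"))
```

With decoding in place, both wire forms give the same text, so the filter
test no longer depends on how the sender chose to encode the line.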

I think this should be the default. We already have $rh_xxx for "raw"
headers, so the basic text would remain available for anybody who
needs it. However, this would be a (small?) incompatible change, which
is why I'm asking:

Does anybody disagree?

--
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.