On Mon, 10 Dec 2001 hoh@??? wrote:
> Many Swedish names use characters outside the a to z and this has
> always been a problem in email addresses. The usual method is to
> replace the "bad" characters with a similar looking "good" character.
> Sometimes a user forgets this and tries to send mail with an unusable
> destination address.
>
> Example:
>
> User enters t.täuber@their_domain.se and the user's email client
> puts this To:-header
>
> To: =?iso-8859-1?Q?t=2Et=E4uber=40their_domain=2Ese?=
This is not a legal header, as I read the documents. RFC 2047 describes
how to handle special characters in headers. It says this:

   5. Use of encoded-words in message headers

   An 'encoded-word' may appear in a message header or body part header
   according to the following rules:

   (1) An 'encoded-word' may replace a 'text' token (as defined by RFC 822)
       in any Subject or Comments header field, ...

   (2) An 'encoded-word' may appear within a 'comment' delimited by "(" and
       ")", i.e., wherever a 'ctext' is allowed. ...

   (3) As a replacement for a 'word' entity within a 'phrase', for example,
       one that precedes an address in a From, To, or Cc header. ...

   These are the ONLY locations where an 'encoded-word' may appear.

So I believe that the email client has screwed up. If you follow RFC 2822
precisely, there is no way to get 8-bit characters into the local part
of an address.
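
As a rough illustration of rule (3), here is a Python sketch (not Exim
configuration, and not something Exim does itself) that flags address
headers in which an encoded-word has ended up in the address proper
rather than in the phrase that precedes it. The function name
misplaced_encoded_words and the simplified regular expression are made
up for this example; email.utils.getaddresses is from the Python
standard library.

```python
import re
from email.utils import getaddresses

# Simplified pattern for an RFC 2047 encoded-word: =?charset?B-or-Q?text?=
# (an assumption for this sketch; it does not enforce every RFC 2047 limit).
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[BbQq]\?[^?\s]*\?=")


def misplaced_encoded_words(header_value: str) -> list[str]:
    """Return the addresses in a To/From/Cc value that contain an encoded-word.

    getaddresses() splits each entry into (display phrase, address).
    An encoded-word in the phrase is allowed by rule (3); one inside the
    address itself is what this function reports.
    """
    return [
        addr
        for _phrase, addr in getaddresses([header_value])
        if ENCODED_WORD.search(addr)
    ]
```

Called on the To: value from the example, this returns the whole
encoded-word as the offending address, while a header such as
"=?iso-8859-1?Q?T=2E_T=E4uber?= <t.tauber@their.domain>" yields an
empty list because the encoded-word sits in the phrase.
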
Now, even if you could, RFC 2047 further notes that "IMPORTANT:
'encoded-word's are designed to be recognized as 'atom's by an RFC 822
parser." This means that you would never be able to encode a complete
email address as one 'encoded-word'. It would at the very least have to
be two, with @ in between them. As I said above, though, I don't think
this is currently a standard. Taken literally, the address
=?iso-8859-1?Q?t=2Et=E4uber?=@their.domain
is of course perfectly legal as far as RFC 2822 goes.
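
To illustrate why that address passes a syntax check: "=" and "?" are
ordinary atom characters in RFC 2822, so the local part is a plain atom
that merely looks like an encoded-word, whereas a raw "ä" is not
allowed. A small Python sketch, with is_dot_atom as a hypothetical
helper name, covering only the dot-atom form of a local part (quoted
strings and obsolete syntax are ignored):

```python
import re

# atext from RFC 2822 section 3.2.4: ASCII letters, digits, and these specials.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# dot-atom-text: one or more atext runs separated by single dots.
DOT_ATOM = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")


def is_dot_atom(local_part: str) -> bool:
    """True if local_part is a syntactically valid RFC 2822 dot-atom."""
    return DOT_ATOM.fullmatch(local_part) is not None


print(is_dot_atom("=?iso-8859-1?Q?t=2Et=E4uber?="))  # encoded-word look-alike
print(is_dot_atom("t.täuber"))                       # raw 8-bit character
```

The first call prints True and the second prints False. Passing this
check says nothing about whether the mailbox exists, only that the
string is acceptable syntax.
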
> The To:-header is then qualified and lowercased by
> Exim to
>
> To: =?iso-8859-1?q?t=2et=e4uber=40their_domain=2ese?=@our_domain.se
>
> resulting in confused users when they look in the headers of
> the bounced email.
I'm slightly surprised at the lowercasing (Exim 4 doesn't do it, but
perhaps I screwed up in Exim 3).
> Is there any easy way to detect badly formed email addresses in Exim
> and reject the email or should the users be modified instead?
This will be easier in Exim 4, which makes it simpler to do tests on
incoming mail at SMTP time.
--
Philip Hazel University of Cambridge Computing Service,
ph10@??? Cambridge, England. Phone: +44 1223 334714.