Re: [exim] Internationalized email

Author: Viktor Dukhovni
Date:  
To: exim-users
CC: Wietse Venema
Subject: Re: [exim] Internationalized email
On Wed, Apr 22, 2015 at 09:52:44PM +0100, Jeremy Harris wrote:

> > Question for the Exim team. It seems you have code for converting
> > UTF-8 localparts to ASCII. What encoding do you use for that?
>
> "xn--" <punycode>
>
> (i.e. like a domain element)


That's rather unwise IMHO. The "xn--" prefix is only applicable
to domain names (IDNA), and should not be used to encode localparts.

> > I am not aware of anything in EAI that standardises such an encoding.
>
> Nor me.


If you're going to encode, at least use "xl--", which is already
in use by another MTA.

> > It seems that Microsoft has an ad-hoc ASCII encoding for UTF-8
> > localparts, are you using the same one? They use "xl--" followed
> > by the Punycode encoding of the localpart.
> >
> >     https://msdn.microsoft.com/en-us/library/dn600431.aspx

> >
> > Are we heading towards a de-facto standard here? Or just multiple
> > ad-hoc approaches.
>
> What's the advantage of "xl--" over "xn--" ?


In domains, the Punycode encoding is applied one label at a time.
There is no such structure in localparts, and Microsoft's encoding
(likely sensibly) just encodes the entire string in one go, "dots
and all".

> Not needing to know the context, that it is a localpart
> rather than a domain, when doing a decode?
>
> It would not be hard to change this aspect of the Exim implementation.


As you can see there's a bit more to it, because the encoding is:

    xl--<Punycode>


not

    xn--<Punycode>.xn--<MorePunycode>...


for each "label" in the localpart (which is not a DNS name).
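
To make the difference concrete, here is a minimal Python sketch of the
two shapes, using the standard library's "punycode" codec. The function
names are made up for illustration, and I am only assuming the general
form described above (a prefix followed by the Punycode of the input);
this is not claimed to match Microsoft's or Exim's actual output in
every detail.

```python
def encode_whole(localpart: str) -> str:
    """Encode the entire localpart in one go, dots and all ("xl--" style)."""
    if localpart.isascii():
        return localpart  # nothing to encode
    return "xl--" + localpart.encode("punycode").decode("ascii")


def decode_whole(encoded: str) -> str:
    """Reverse encode_whole(); input without the prefix is returned as-is."""
    if not encoded.startswith("xl--"):
        return encoded
    return encoded[len("xl--"):].encode("ascii").decode("punycode")


def encode_per_label(localpart: str) -> str:
    """Encode each dot-separated piece separately, the way IDNA treats
    the labels of a domain name ("xn--" style)."""
    return ".".join(
        label if label.isascii()
        else "xn--" + label.encode("punycode").decode("ascii")
        for label in localpart.split(".")
    )


if __name__ == "__main__":
    lp = "münchen.bücher"
    print(encode_whole(lp))      # one prefix, one Punycode string
    print(encode_per_label(lp))  # one prefix per dot-separated piece
```

With the whole-string form the dot is just another basic character
inside a single Punycode string, so a decoder has to treat the result
as one unit rather than splitting on "." first.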

It might also make sense to discuss this with some of the IETF
"apps" WG folks who worked on EAI, and hear whether they want to
discourage you from emulating Microsoft's approach and how compelling
their argument is.

-- 
    Viktor.