Re: [exim] IDN, UTF-8 and Punycode curiosity

Author: Phil Pennock
Date:  
To: Mark Elkins
CC: exim-users
Subject: Re: [exim] IDN, UTF-8 and Punycode curiosity
On 2012-01-04 at 18:33 +0200, Mark Elkins wrote:
> Phil:
> What exactly do you have 'allow_utf8_domains' and
> 'dns_check_names_pattern' set to, and exactly where should they be
> included in the config file? I'm running exim-4.77 (Gentoo).
> I'd just like to get it right first time around.


In the main section of the config file, after the macros, before the
first "begin" line, I have:
----------------------------8< cut here >8------------------------------
allow_utf8_domains
# this pattern is straight from spec.txt for allow_utf8_domains:
dns_check_names_pattern = (?i)^(?>(?(1)\.|())[a-z0-9\xc0-\xff]\
        (?>[-a-z0-9\x80-\xff]*[a-z0-9\x80-\xbf])?)+$
----------------------------8< cut here >8------------------------------
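
To illustrate what that pattern lets through, here is a rough Python
sketch. It is not how Exim evaluates the option (Exim hands the pattern
to PCRE as-is); it is a simplified per-label restatement of the same
byte classes, which as far as I can tell accepts the same names. The
function name is made up for the example, and it assumes the name is
checked as raw UTF-8 bytes.

```python
import re

# One DNS label, using the byte classes from dns_check_names_pattern:
#   first byte: ASCII alphanumeric or a UTF-8 lead byte (0xC0-0xFF)
#   middle:     hyphen, ASCII alphanumeric, or any high byte (0x80-0xFF)
#   last byte:  ASCII alphanumeric or a UTF-8 continuation byte (0x80-0xBF)
_LABEL = re.compile(
    rb"[a-z0-9\xc0-\xff](?:[-a-z0-9\x80-\xff]*[a-z0-9\x80-\xbf])?",
    re.IGNORECASE,
)


def name_passes_pattern(name: str) -> bool:
    """Hypothetical helper: check each dot-separated label of a name."""
    labels = name.encode("utf-8").split(b".")
    # An empty label (leading, trailing or doubled dot) never matches.
    return all(_LABEL.fullmatch(label) is not None for label in labels)


if __name__ == "__main__":
    for candidate in ("bücher.example", "xn--bcher-kva.example",
                      "-bad.example", "bad-.example", "a_b.example"):
        print(candidate, name_passes_pattern(candidate))
```

The point is only that labels may now contain UTF-8 sequences, while a
leading or trailing hyphen and characters such as underscore are still
rejected.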


> Are there mail clients (MUA) which translate between UTF-8 and PunyCode
> or is this the job of the MTA to try and sort out? Thunderbird and
> Evolution both fail with a similar


For internationalised email, done correctly, we're almost at the point
where we can say "the SMTP server advertises an extension in response to
EHLO which tells the client it can send UTF-8 data, and the SMTP server
is responsible for performing IDNA lookups".

RFC 5336 is _Experimental_; the IETF has worked on a Standards-track
document to replace it, visible at:
http://datatracker.ietf.org/doc/draft-ietf-eai-rfc5336bis/

On 22nd November, the ietf-announce mailing list noted that
draft-ietf-eai-rfc5336bis-16.txt had been approved as a Proposed
Standard. However, it has not yet been published as an RFC.

The most likely course of action for Exim is to wait until the hold-up,
whatever it is, has been resolved, then implement the shiny new
standards, keeping standards-compliant behaviour as the default while
making sure that there are escape hatches for folk who want to do
something different. The first release will probably have
EXPERIMENTAL_* build-time guards.

Generally speaking, Exim is strongly biased towards adhering to the IETF
standards, but is not committed to implementing all of them, and is
willing to make the default behaviour deviate from those standards, but
only if there is a very compelling argument.

E.g., http://bugs.exim.org/817, where we're pretty much agreed to turn on
accept_8bitmime by default, because it's what most other systems do in
practice and *not* advertising 8BITMIME is causing more operational
problems than would be caused by advertising it and failing to
down-convert/bounce.

> It would probably be useful if exim could at least 'translate' puny to
> UTF in the final deliver stage (eg: deliver by mysql) - but I might be
> able to fudge that by having both versions of the name in my DB table -
> ie end up with e-mail delivered to my exim via punycode being
> appropriately placed in the correct UTF directory with UTF headers.
>
> Some native UTF-8/Punycode would be preferred though.


Yes, I went so far as to register a punycode domain so that I could test
any changes made to Exim for this, but I then changed employer and
things got hectic, so I failed to make any progress.
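
In the meantime, if you do go with keeping both versions of the name in
your DB table, the two forms can be generated outside Exim. A minimal
sketch, assuming Python and its built-in "idna" codec (which implements
the older IDNA 2003 rules, so treat the output as illustrative); the
function names are made up for the example and nothing here is Exim
functionality:

```python
def to_punycode(domain: str) -> str:
    """Return the ASCII (xn--) form of a domain name."""
    return domain.encode("idna").decode("ascii")


def to_utf8_form(domain: str) -> str:
    """Return the Unicode form of an ASCII (xn--) domain name."""
    return domain.encode("ascii").decode("idna")


def both_forms(domain: str) -> tuple[str, str]:
    """Hypothetical helper: (punycode, unicode) pair for a lookup table."""
    ascii_form = to_punycode(domain)
    return ascii_form, to_utf8_form(ascii_form)


if __name__ == "__main__":
    print(both_forms("bücher.example"))
```

A table keyed on both strings lets mail that arrives addressed to the
punycode name be mapped to the same mailbox as the UTF-8 name.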

Regards,
-Phil