Re: [exim-dev] Bad utf-8 in pgsql lookup and mainlog

Author: Axel Rau
Date:  
To: Phil Pennock
CC: exim-dev
Subject: Re: [exim-dev] Bad utf-8 in pgsql lookup and mainlog
Thanks, Phil, for this very helpful response.

On 21.07.2013 at 05:35, Phil Pennock <pdp@???> wrote:

> On 2013-07-20 at 19:05 +0200, Axel Rau wrote:
>> As exim works with utf-8 strings, my naive assumption was, that a header like
>>     Subject: Neue =?ISO-8859-1?q?Gl=E4ser?=
>> (RFC 2047) will be converted to utf-8 by exim before I access it via $h_Subject: .
>> Looking at the complexity of expand.c, this seems to be proved.
>> Can anybody confirm this?

>
> Exim's behaviour depends upon what value was defined for HEADERS_CHARSET
> in Local/Makefile when Exim was built. You also need HAVE_ICONV=yes but
> that's supplied by default on some OSes.
>
> The sample configuration supplied in src/EDITME sets
> HEADERS_CHARSET="ISO-8859-1".
>
> For myself, I always set HEADERS_CHARSET="UTF-8".


Indeed, the FreeBSD ports system defaults to ISO-8859-1.
I reinstalled with UTF-8. Unfortunately, exim -bV does not list HEADERS_CHARSET.
From my simple tests, it seems to work:
    Subject: TEST =?ISO-8859-1?q?Gl=FCckliche_m=F6gliche_=C4chtung?=
was recorded correctly in UTF-8 in the DB. 
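
For reference, the conversion I expect for that test header can be reproduced outside Exim. This is only a sketch using Python's standard email.header module as a stand-in; it is not Exim's code path, and the variable names are my own:

```python
from email.header import decode_header, make_header

# The RFC 2047 encoded test header from the message above.
raw_subject = "TEST =?ISO-8859-1?q?Gl=FCckliche_m=F6gliche_=C4chtung?="

# decode_header() splits the value into (bytes, charset) parts;
# make_header() joins them back into a single Unicode string.
decoded = str(make_header(decode_header(raw_subject)))

# The UTF-8 byte sequence that should end up in the database.
utf8_bytes = decoded.encode("utf-8")

print(decoded)
print(utf8_bytes)
```

Comparing the printed bytes with what the pgsql lookup stored is a quick way to check that a HEADERS_CHARSET="UTF-8" build behaves as intended.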

>
>> If the header contains non-ASCII 8-bit characters (=illegal), I would like exim to replace them by "?".
>> Can this be done in the exim config or do we need a new expansion function for that?
>
> I *suspect* that a new expansion function would be needed, but I could
> be proven wrong by a particularly clever hack. I also suspect that, if
> we were to implement this, we'd default the replacement character to be
> codepoint 0xFFFD, the Unicode REPLACEMENT CHARACTER.



Wouldn't this be a reasonable enhancement of the existing conversion functionality anyway?
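
To illustrate the two replacement behaviours under discussion, here is a minimal sketch in Python. It is not an Exim expansion function; the function names are hypothetical and only show what "replace by ?" versus "replace by U+FFFD" would mean for a header containing a raw 8-bit byte:

```python
REPLACEMENT_CHARACTER = "\ufffd"  # Unicode REPLACEMENT CHARACTER

def to_unicode_with_replacement(raw: bytes) -> str:
    """Decode as UTF-8; each invalid byte sequence becomes U+FFFD."""
    return raw.decode("utf-8", errors="replace")

def to_ascii_with_question_marks(raw: bytes) -> str:
    """Keep ASCII as-is; every non-ASCII byte becomes '?'."""
    text = raw.decode("ascii", errors="replace")
    return text.replace(REPLACEMENT_CHARACTER, "?")

# A header with an unencoded ISO-8859-1 byte (0xE4), which is illegal
# in a header and is also not valid UTF-8 in this position.
bad_header = b"Subject: Neue Gl\xe4ser"

print(to_unicode_with_replacement(bad_header))
print(to_ascii_with_question_marks(bad_header))
```

The second variant loses more information, since valid UTF-8 input would also be reduced to question marks, which may be one reason to prefer U+FFFD as the default.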

Axel
---
PGP-Key:29E99DD6 ☀ +49 151 2300 9283 ☀ computing @ chaos claudius