Re: [exim] Exim and MySQL with UTF-8 encoding

Author: Yves Goergen
Date:  
To: Jeremy Harris, exim-users
Subject: Re: [exim] Exim and MySQL with UTF-8 encoding
Yes, it's the letter "ü", quoted-printable-encoded, as I already showed.
And it's only one byte. If the message were properly UTF-8-encoded,
there would be two bytes: =C3=BC. This tells me that Exim generated a
message with Latin1 encoding (the western default) instead of UTF-8. I
can only imagine that this happens because it got Latin1 bytes from the
database instead of UTF-8 bytes and then passed them through unchanged
(why should it care?).
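
To make the byte counts above concrete, here is a small Python sketch (not anything Exim itself runs) that encodes "ü" both ways and prints the bytes in quoted-printable style. The helper name `qp_escape` is made up for this illustration, and it escapes every byte, which matches real quoted-printable only for non-ASCII bytes like these.

```python
def qp_escape(data: bytes) -> str:
    """Render each byte as =XX, the way quoted-printable shows non-ASCII bytes.

    Simplified for illustration: real quoted-printable leaves printable
    ASCII characters unescaped.
    """
    return "".join(f"={byte:02X}" for byte in data)


text = "ü"  # U+00FC

latin1_bytes = text.encode("latin-1")  # one byte
utf8_bytes = text.encode("utf-8")      # two bytes

print(len(latin1_bytes), qp_escape(latin1_bytes))
print(len(utf8_bytes), qp_escape(utf8_bytes))
```

The first line shows the single byte =FC seen in the reply message, the second the two-byte sequence =C3=BC that a UTF-8 message would contain.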

There are more non-ASCII letters in my test and all of them look like
this. There's also one non-Latin1 character that's just replaced by a
question mark (?) in the reply message. Probably MySQL did that because
it couldn't deliver the character through the Latin1 client connection.
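
The question-mark substitution can be imitated in Python as a rough analogy. This is only a sketch of the suspected effect, not MySQL's actual conversion code; the function name `through_latin1` and the sample character "ж" are assumptions chosen for the illustration (the character from my test is not reproduced here).

```python
def through_latin1(text: str) -> str:
    """Simulate a round trip through a Latin1-only channel.

    Characters that Latin1 cannot represent are replaced by '?',
    which is what errors="replace" does when encoding.
    """
    return text.encode("latin-1", errors="replace").decode("latin-1")


sample = "üж"  # "ü" exists in Latin1, the Cyrillic "ж" (U+0436) does not
print(through_latin1(sample))
```

Here "ü" survives the trip while "ж" comes back as a question mark, which is the same pattern as in the reply message.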

Yves Goergen
http://unclassified.software

________________________________________
From: Jeremy Harris
Sent: Thu, 2017-11-09 15:10 +0100
On 08/11/17 21:15, Yves Goergen wrote:
> But the problem is that Exim doesn't talk to the MySQL server with UTF-8
> so it prevents using all that stuff. Instead, it uses some 8-bit
> encoding. I can see this in the reply message: It contains parts like
> =FC for "ü" where it should be at least two bytes.


That "=FC" might be an RFC 2047 encoded byte, perhaps?

Lowercase 'U' with umlaut appears to be Unicode U+00FC.