On 2023-07-22, Exim Bugzilla via Exim-dev <exim-dev@???> wrote:
> https://bugs.exim.org/show_bug.cgi?id=2998
>
> --- Comment #1 from Jeremy Harris <jgh146exb@???> ---
> The patch looks simple, but I can't pretend to understand that bit of
> RFC 2279. It seems to be talking about UCS-2 rather than UTF-8.
> Is a better description possible?

Interestingly, that RFC seems to use UCS-2 interchangeably with UTF-16.

There was an excellent discussion of WTF-8 (like UTF-8 but with
surrogates) somewhere on the internet (I thought Wikipedia, but I
can't find it now).
https://unicodebook.readthedocs.io/unicode_encodings.html
section 7.5. UTF-16 surrogate pairs

This bug is mainly motivated by PostgreSQL only accepting well-formed
UTF-8, so UTF-8 that encodes uFE01 is rejected and leads to
misbehaviour.
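
As a rough illustration of the well-formedness point (not Exim code, and
not how PostgreSQL does its check), here is a small Python sketch. It
assumes Python's strict "utf-8" codec is a fair stand-in for a
well-formedness check. The helper name is_well_formed_utf8 and the
sample code points U+D800 and U+10000 are my own picks for the example.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if data decodes under Python's strict UTF-8 codec."""
    try:
        data.decode("utf-8")  # strict mode rejects encoded surrogates
    except UnicodeDecodeError:
        return False
    return True


# U+D800 is a high surrogate. The "surrogatepass" error handler forces
# Python to emit the 3-byte WTF-8 style sequence for it anyway.
encoded_surrogate = "\ud800".encode("utf-8", "surrogatepass")

# U+10000 needs a surrogate pair in UTF-16, but in well-formed UTF-8 it
# is a single 4-byte sequence with no surrogates involved.
supplementary = "\U00010000".encode("utf-8")

print(encoded_surrogate.hex(), is_well_formed_utf8(encoded_surrogate))
print(supplementary.hex(), is_well_formed_utf8(supplementary))
```

The first sequence is what a strict consumer refuses. The second is what
the same kind of character looks like when it is encoded directly rather
than through UTF-16 surrogates.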
--
Jasen.
🇺🇦 Слава Україні