[pcre-dev] [Bug 2527] Incomplete unicode handling in pcre2_…

Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 2527] Incomplete unicode handling in pcre2_substitute when converting to upper/lower case
https://bugs.exim.org/show_bug.cgi?id=2527

--- Comment #2 from Philip Hazel <ph10@???> ---
After some discussion and thought I have come to the conclusion that PCRE2_UCP
is needed. This is why: Characters < 128 are ASCII and are no problem;
characters > 255 are either unknown or Unicode, and one could perhaps just
assume Unicode. However, characters in the range 128-255 are an issue. There
are programs that use locales to handle such characters, so assuming Unicode is
wrong - and also not backwards-compatible. At the moment, Unicode is assumed if
UTF is set; I propose to change the code so that Unicode is used for case
tests/changes if *either* UTF *or* UCP is set.
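
The proposed change can be illustrated with a minimal sketch. This is not code from PCRE2: the two flag constants are hypothetical stand-ins for the PCRE2_UTF and PCRE2_UCP option bits (their values here are not taken from pcre2.h), and the helper function names are invented for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PCRE2_UTF and PCRE2_UCP option bits.
   The values are illustrative only and are NOT the ones in pcre2.h. */
#define SKETCH_UTF 0x1u
#define SKETCH_UCP 0x2u

/* Behaviour described as current: Unicode case rules are used for
   case tests/changes only when UTF is set. */
static bool use_unicode_case_current(uint32_t options)
{
    return (options & SKETCH_UTF) != 0;
}

/* Proposed behaviour: Unicode case rules are used when either UTF
   or UCP is set. With neither bit set, characters in the range
   128-255 are left to whatever non-Unicode handling applies
   (for example locale-based tables). */
static bool use_unicode_case_proposed(uint32_t options)
{
    return (options & (SKETCH_UTF | SKETCH_UCP)) != 0;
}
```

Under these assumptions the two predicates differ only when UCP is set without UTF: the current rule returns false for that combination and the proposed rule returns true. With neither option set both return false, so programs that rely on locales for characters 128-255 keep their existing behaviour.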
