https://bugs.exim.org/show_bug.cgi?id=2625
Bug ID: 2625
Summary: Unexpected caseless matching of ASCII "s" when using
"[\x{00FF}-\x{FFEE}]" in UTF-16 text
Product: PCRE
Version: 10.35 (PCRE2)
Hardware: x86-64
OS: All
Status: NEW
Severity: bug
Priority: medium
Component: Code
Assignee: ph10@???
Reporter: siegel@???
CC: pcre-dev@???
Using r1267, macOS 10.14.6 (x86_64):
Given this text represented as UTF-16:
this is a test
Search using this pattern, which as written should *not* match any ASCII
characters:
[\x{00FF}-\x{FFEE}]
If the pattern is compiled with PCRE2_CASELESS set, pcre2_match() returns a
match at the first "s" in the subject text, even though "s" is outside the
explicit range of characters. (An uppercase "S" in the subject matches as
well.)
Further testing shows that "k" and "K" also match, presumably with the same
underlying cause.
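One possible explanation, offered as an assumption rather than a diagnosis:
Unicode simple case folding maps U+017F (LATIN SMALL LETTER LONG S) to "s" and
U+212A (KELVIN SIGN) to "k", and both code points lie inside the range
U+00FF..U+FFEE. The sketch below only illustrates that arithmetic. It does not
call PCRE2, it is not a reproduction of the bug, and the helper names
simple_fold() and range_matches_caseless() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of PCRE2. Handles only the cases needed
   for this illustration: the two non-ASCII code points that Unicode simple
   case folding maps to ASCII letters, plus plain ASCII lowercasing. */
uint32_t simple_fold(uint32_t c)
{
    if (c == 0x017F) return 's';   /* LATIN SMALL LETTER LONG S */
    if (c == 0x212A) return 'k';   /* KELVIN SIGN */
    if (c >= 'A' && c <= 'Z') return c + ('a' - 'A');
    return c;
}

/* True if some code point in [lo, hi] folds to the same value as c.
   Brute-force scan, which is fine for a demonstration. */
bool range_matches_caseless(uint32_t lo, uint32_t hi, uint32_t c)
{
    uint32_t target = simple_fold(c);
    for (uint32_t m = lo; m <= hi; m++) {
        if (simple_fold(m) == target) return true;
    }
    return false;
}
```

Under these assumptions, range_matches_caseless(0x00FF, 0xFFEE, c) is true for
"s", "S", "k" and "K" and false for the other ASCII letters, which is
consistent with the matches described above.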
The compile options that stay the same across all tests are (PCRE2_UCP |
PCRE2_MULTILINE | PCRE2_AUTO_CALLOUT), and PCRE2_EXTRA_ESCAPED_CR_IS_LF is set
in the extra options (via pcre2_set_compile_extra_options()).
I regret I don't have a trivial test program to demonstrate this, but if you
find that you're not able to reproduce it from this description, please let me
know and I'll see if I can come up with something.