https://bugs.exim.org/show_bug.cgi?id=1697
Bug ID: 1697
Summary: Incorrect compilation of classes containing ucase mnemonics and properties
Product: PCRE
Version: 8.37
Hardware: All
OS: All
Status: NEW
Severity: bug
Priority: medium
Component: Code
Assignee: ph10@???
Reporter: justin.viiret@???
CC: pcre-dev@???
Some fuzzer testing of PCRE and Intel's Hyperscan pattern matching library
produced this pattern:
/[\W\p{Any}]/
When compiled without the PCRE_UCP flag, this should match any character,
since \p{Any} alone covers everything. Instead, we found that it behaves the
same way as just the class [\W]. Running 'pcretest -d' shows:
----
$ bin/pcretest -d
PCRE version 8.37 2015-04-28
re> /[\W\p{Any}]/
------------------------------------------------------------------
0 36 Bra
3 [\x00-/:-@[-^`{-\xff] (neg)
36 36 Ket
39 End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
No first char
No need char
data> -
0: -
data> a
No match
----
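As a sanity check on the listing above, here is a minimal Python sketch that models the class as a plain set of byte values. It does not call PCRE at all; the names WORD, NON_WORD, observed and expected are illustrative only, and the ranges are copied by hand from the pcretest output. It assumes the non-UCP meaning of \W (anything outside [A-Za-z0-9_]) in the default locale.

```python
import string

# Non-UCP word characters: ASCII letters, digits and underscore.
WORD = {ord(c) for c in string.ascii_letters + string.digits + "_"}
ALL = set(range(256))
NON_WORD = ALL - WORD

# Ranges copied from the compiled class shown by 'pcretest -d':
#   [\x00-/:-@[-^`{-\xff]
observed_ranges = [
    (0x00, ord("/")),
    (ord(":"), ord("@")),
    (ord("["), ord("^")),
    (ord("`"), ord("`")),
    (ord("{"), 0xFF),
]
observed = {c for lo, hi in observed_ranges for c in range(lo, hi + 1)}

# What the class should contain: the union of \W and \p{Any}.
expected = NON_WORD | ALL

print("compiled class equals \\W alone:", observed == NON_WORD)
print("expected class equals every byte:", expected == ALL)
print("characters wrongly excluded:",
      "".join(chr(c) for c in sorted(expected - observed)))
```

Under these assumptions the compiled class is exactly \W, and the characters that are wrongly excluded are precisely the word characters, which is consistent with 'a' failing to match while '-' matches.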
My suspicion is that this is something to do with the interaction of a negated
class mnemonic (like \W, \D, \S) and the property xclass -- perhaps the
handling of the should_flip_negation bool in pcre_compile.c? The pattern above
is interpreted as /[\P{Xwd}\p{Any}]/ if the PCRE_UCP flag is set, which looks
right, so it's only an issue without the flag.
I checked against PCRE2 10.20 as well, and it exhibits the same behaviour.