[pcre-dev] [Bug 1697] New: Incorrect compilation of classes …

Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1697] New: Incorrect compilation of classes containing ucase mnemonics and properties
https://bugs.exim.org/show_bug.cgi?id=1697

            Bug ID: 1697
           Summary: Incorrect compilation of classes containing ucase
                    mnemonics and properties
           Product: PCRE
           Version: 8.37
          Hardware: All
                OS: All
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
          Assignee: ph10@???
          Reporter: justin.viiret@???
                CC: pcre-dev@???


Some fuzzer testing of PCRE and Intel's Hyperscan pattern matching library
produced this pattern:

/[\W\p{Any}]/

.. compiled without the PCRE_UCP flag. We would expect this class to match
any character. Instead, we found that it behaves the same way as the class
[\W] on its own. Running 'pcretest -d' shows:

----
$ bin/pcretest -d
PCRE version 8.37 2015-04-28

re> /[\W\p{Any}]/

------------------------------------------------------------------
  0  36 Bra
  3     [\x00-/:-@[-^`{-\xff] (neg)
 36  36 Ket
 39     End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
No first char
No need char

data> -

0: -
data> a

No match
----
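
The gap between the expected and the observed class can be modelled with plain
sets. The following is a minimal Python sketch, not PCRE code: Python's re
module has no \p{Any}, so the set of all 256 byte values stands in for it
(an assumption made purely for illustration), and a bytes pattern is used so
that \W is ASCII-only, as in PCRE without PCRE_UCP. The names expected,
observed and missing are hypothetical.

```python
import re

# All 256 single-byte subjects (8-bit, non-UTF mode).
ALL_BYTES = [bytes([b]) for b in range(256)]

# For a bytes pattern, \W is ASCII-only: everything except [A-Za-z0-9_].
NON_WORD = re.compile(rb"\W")

non_word = {b for b in ALL_BYTES if NON_WORD.fullmatch(b)}
any_char = set(ALL_BYTES)  # stand-in for \p{Any}; not a real re feature

# A class is the union of its members, so [\W\p{Any}] should cover everything.
expected = non_word | any_char

# What the pcretest dump shows instead: only the \W part survives.
observed = non_word

# Bytes that should match but do not: exactly the ASCII word characters.
missing = sorted(expected - observed)

print(len(expected), len(observed), len(missing))
print(b"-" in observed, b"a" in observed)
```

In this model the missing bytes are the 63 ASCII word characters, which is
consistent with "-" matching and "a" failing in the session above.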

My suspicion is that this is something to do with the interaction of a negated
class mnemonic (like \W, \D, \S) and the property xclass -- perhaps the
handling of the should_flip_negation bool in pcre_compile.c? The pattern above
is interpreted as /[\P{Xwd}\p{Any}]/ if the PCRE_UCP flag is set, which looks
right, so it's only an issue without the flag.

I checked against PCRE2 10.20 as well, and it exhibits the same behaviour.
