Author: Justin Viiret
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1494] New: \Q..\E inside character class interprets contents as literal sequence
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=1494
           Summary: \Q..\E inside character class interprets contents as
                    literal sequence
           Product: PCRE
           Version: 8.35
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
        AssignedTo: ph10@???
        ReportedBy: justin.viiret@???
                CC: pcre-dev@???



Here's an unusual case that looks like a parsing issue with the \Q..\E quoted
sequence syntax within a character class.

If PCRE 8.30+ (including version 8.35) is given the expression below, the
characters between the \Q and \E are compiled as a literal sequence, rather
than as the contents of a character class:

$ ./pcretest -d
PCRE version 8.35 2014-04-04

re> /[\Qa]\E]/

------------------------------------------------------------------
  0   7 Bra
  3     a]
  7   7 Ket
 10     End
------------------------------------------------------------------


However, reversing the order of the two characters causes PCRE to interpret
these as the contents of a character class:

$ ./pcretest -d
PCRE version 8.35 2014-04-04

re> /[\Q]a\E]/

------------------------------------------------------------------
  0  36 Bra
  3     [\]a]
 36  36 Ket
 39     End
------------------------------------------------------------------


I've checked version 8.21, and it interpreted these two patterns identically.

Notably, this issue seems to depend on there being exactly one character
followed by "]" inside the \Q..\E; adding another character, as in
/[\Qab]\E]/, produces a character class.
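
For reference, the interpretation I would expect for all three patterns can be
sketched as follows. This is only an illustration in Python, not PCRE code:
class_members is a hypothetical helper written for this report, and it assumes
the class contains nothing but plain literal characters and \Q..\E sections
(no ranges, negation, other escapes, or POSIX classes).

```python
def class_members(pattern):
    """Return the set of characters in a pattern of the form [...].

    Hypothetical helper for illustration only. Everything between \\Q and \\E
    is treated as a literal class member, including "]".
    """
    if not pattern.startswith("["):
        raise ValueError("pattern must start with '['")
    members = set()
    quoting = False
    i = 1
    while i < len(pattern):
        if quoting:
            if pattern.startswith("\\E", i):
                quoting = False          # end of the quoted section
                i += 2
            else:
                members.add(pattern[i])  # quoted characters are literals
                i += 1
        elif pattern.startswith("\\Q", i):
            quoting = True               # start of the quoted section
            i += 2
        elif pattern[i] == "]":
            if i != len(pattern) - 1:
                raise ValueError("text after the closing ']' is not handled")
            return members               # unquoted "]" closes the class
        else:
            members.add(pattern[i])
            i += 1
    raise ValueError("unterminated character class")


if __name__ == "__main__":
    for p in (r"[\Qa]\E]", r"[\Q]a\E]", r"[\Qab]\E]"):
        print(p, sorted(class_members(p)))
```

Under these assumptions, /[\Qa]\E]/ and /[\Q]a\E]/ describe the same class
(the characters "a" and "]"), so I would expect them to compile to the same
thing.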


--
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email