[Pcre-svn] [1617] code/trunk: Fix /x bug when pattern starts…

Author: Subversion repository
Date:
To: pcre-svn
Subject: [Pcre-svn] [1617] code/trunk: Fix /x bug when pattern starts with white space and (?-x)
Revision: 1617
          http://vcs.pcre.org/viewvc?view=rev&revision=1617
Author:   ph10
Date:     2015-12-03 17:05:40 +0000 (Thu, 03 Dec 2015)
Log Message:
-----------
Fix /x bug when pattern starts with white space and (?-x)


Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/pcre_compile.c
    code/trunk/testdata/testoutput2


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2015-11-30 17:44:45 UTC (rev 1616)
+++ code/trunk/ChangeLog    2015-12-03 17:05:40 UTC (rev 1617)
@@ -26,6 +26,12 @@
 6 . An empty \Q\E sequence between an item and its qualifier caused
     pcre_compile() to misbehave when auto callouts were enabled. This bug was
     found by the LLVM fuzzer.
+    
+7 . If a pattern that was compiled with PCRE_EXTENDED started with white 
+    space or a #-type comment that was followed by (?-x), which turns off 
+    PCRE_EXTENDED, and there was no subsequent (?x) to turn it on again,
+    pcre_compile() assumed that (?-x) applied to the whole pattern and
+    consequently mis-compiled it. This bug was found by the LLVM fuzzer.



 Version 8.38 23-November-2015

Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c    2015-11-30 17:44:45 UTC (rev 1616)
+++ code/trunk/pcre_compile.c    2015-12-03 17:05:40 UTC (rev 1617)
@@ -7607,39 +7607,15 @@
         newoptions = (options | set) & (~unset);


         /* If the options ended with ')' this is not the start of a nested
-        group with option changes, so the options change at this level. If this
-        item is right at the start of the pattern, the options can be
-        abstracted and made external in the pre-compile phase, and ignored in
-        the compile phase. This can be helpful when matching -- for instance in
-        caseless checking of required bytes.
-
-        If the code pointer is not (cd->start_code + 1 + LINK_SIZE), we are
-        definitely *not* at the start of the pattern because something has been
-        compiled. In the pre-compile phase, however, the code pointer can have
-        that value after the start, because it gets reset as code is discarded
-        during the pre-compile. However, this can happen only at top level - if
-        we are within parentheses, the starting BRA will still be present. At
-        any parenthesis level, the length value can be used to test if anything
-        has been compiled at that level. Thus, a test for both these conditions
-        is necessary to ensure we correctly detect the start of the pattern in
-        both phases.
-
+        group with option changes, so the options change at this level. 
         If we are not at the pattern start, reset the greedy defaults and the
         case value for firstchar and reqchar. */


         if (*ptr == CHAR_RIGHT_PARENTHESIS)
           {
-          if (code == cd->start_code + 1 + LINK_SIZE &&
-               (lengthptr == NULL || *lengthptr == 2 + 2*LINK_SIZE))
-            {
-            cd->external_options = newoptions;
-            }
-          else
-            {
-            greedy_default = ((newoptions & PCRE_UNGREEDY) != 0);
-            greedy_non_default = greedy_default ^ 1;
-            req_caseopt = ((newoptions & PCRE_CASELESS) != 0)? REQ_CASELESS:0;
-            }
+          greedy_default = ((newoptions & PCRE_UNGREEDY) != 0);
+          greedy_non_default = greedy_default ^ 1;
+          req_caseopt = ((newoptions & PCRE_CASELESS) != 0)? REQ_CASELESS:0;


           /* Change options at this level, and pass them back for use
           in subsequent branches. */


Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2015-11-30 17:44:45 UTC (rev 1616)
+++ code/trunk/testdata/testoutput2    2015-12-03 17:05:40 UTC (rev 1617)
@@ -419,7 +419,7 @@


 /(?U)<.*>/I
 Capturing subpattern count = 0
-Options: ungreedy
+No options
 First char = '<'
 Need char = '>'
     abc<def>ghi<klm>nop
@@ -443,7 +443,7 @@


 /(?U)={3,}?/I
 Capturing subpattern count = 0
-Options: ungreedy
+No options
 First char = '='
 Need char = '='
     abc========def
@@ -477,7 +477,7 @@


 /(?i)abc/I
 Capturing subpattern count = 0
-Options: caseless
+No options
 First char = 'a' (caseless)
 Need char = 'c' (caseless)

@@ -489,7 +489,7 @@

 /(?i)^1234/I
 Capturing subpattern count = 0
-Options: anchored caseless
+Options: anchored
 No first char
 No need char

@@ -502,7 +502,7 @@
 /(?s).*/I
 Capturing subpattern count = 0
 May match empty string
-Options: anchored dotall
+Options: anchored
 No first char
 No need char

@@ -516,7 +516,7 @@

 /(?i)[abcd]/IS
 Capturing subpattern count = 0
-Options: caseless
+No options
 No first char
 No need char
 Subject length lower bound = 1
@@ -524,7 +524,7 @@

 /(?m)[xy]|(b|c)/IS
 Capturing subpattern count = 1
-Options: multiline
+No options
 No first char
 No need char
 Subject length lower bound = 1
@@ -538,7 +538,7 @@

 /(?i)(^a|^b)/Im
 Capturing subpattern count = 1
-Options: caseless multiline
+Options: multiline
 First char at start or follows newline
 No need char

@@ -1179,7 +1179,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 1
-Options: anchored dotall
+Options: anchored
 No first char
 No need char


@@ -2735,7 +2735,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: caseless extended
+Options: extended
 First char = 'a' (caseless)
 Need char = 'c' (caseless)


@@ -2748,7 +2748,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: caseless extended
+Options: extended
 First char = 'a' (caseless)
 Need char = 'c' (caseless)


@@ -3095,7 +3095,7 @@
         End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
-Options: ungreedy
+No options
 First char = 'x'
 Need char = 'b'
     xaaaab
@@ -3497,7 +3497,7 @@


 /(?i)[ab]/IS
 Capturing subpattern count = 0
-Options: caseless
+No options
 No first char
 No need char
 Subject length lower bound = 1
@@ -6299,7 +6299,7 @@
 Named capturing subpatterns:
 A 2
 A 3
-Options: anchored dupnames
+Options: anchored
 Duplicate name status changes
 No first char
 No need char