[Pcre-svn] [231] code/trunk: Fix bugs when (?!) is used as a…

Author: Subversion repository
To: pcre-svn
Subject: [Pcre-svn] [231] code/trunk: Fix bugs when (?!) is used as a condition.
Revision: 231
          http://www.exim.org/viewvc/pcre2?view=rev&revision=231
Author:   ph10
Date:     2015-03-24 10:21:34 +0000 (Tue, 24 Mar 2015)


Log Message:
-----------
Fix bugs when (?!) is used as a condition.

Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/HACKING
    code/trunk/src/pcre2_compile.c
    code/trunk/src/pcre2_dfa_match.c
    code/trunk/src/pcre2_match.c
    code/trunk/testdata/testinput2
    code/trunk/testdata/testinput6
    code/trunk/testdata/testoutput2
    code/trunk/testdata/testoutput6


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2015-03-24 08:43:52 UTC (rev 230)
+++ code/trunk/ChangeLog    2015-03-24 10:21:34 UTC (rev 231)
@@ -14,10 +14,18 @@


4. Implemented pcre2_callout_enumerate().

-5. Fix JIT compilation of conditional blocks whose assertion
- is converted to (*FAIL). E.g: /(?(?!))/.
+5. Fix JIT compilation of conditional blocks whose assertion is converted to
+(*FAIL). E.g: /(?(?!))/.

+6. The pattern /(?(?!)^)/ caused references to random memory. This bug was
+discovered by the LLVM fuzzer.

+7. The assertion (?!) is optimized to (*FAIL). This was not handled correctly
+when this assertion was used as a condition, for example (?(?!)a|b). In
+pcre2_match() it worked by luck; in pcre2_dfa_match() it gave an incorrect
+error about an unsupported item.
+
+
Version 10.10 06-March-2015
---------------------------

@@ -120,12 +128,13 @@
to be compiled, leading to the error "internal error: previously-checked
referenced subpattern not found" when an incorrect memory address was read.
This bug was reported as "heap overflow", discovered by Kai Lu of Fortinet's
-FortiGuard Labs.
+FortiGuard Labs. (Added 24-March-2015: CVE-2015-2325 was given to this.)

23. A pattern such as "((?+1)(\1))/" containing a forward reference subroutine
call within a group that also contained a recursive back reference caused
incorrect code to be compiled. This bug was reported as "heap overflow",
-discovered by Kai Lu of Fortinet's FortiGuard Labs.
+discovered by Kai Lu of Fortinet's FortiGuard Labs. (Added 24-March-2015:
+CVE-2015-2326 was given to this.)

24. Computing the size of the JIT read-only data in advance has been a source
of various issues, and new ones are still appear unfortunately. To fix

Modified: code/trunk/HACKING
===================================================================
--- code/trunk/HACKING    2015-03-24 08:43:52 UTC (rev 230)
+++ code/trunk/HACKING    2015-03-24 10:21:34 UTC (rev 231)
@@ -210,7 +210,8 @@
   OP_THEN                )


OP_ASSERT_ACCEPT is used when (*ACCEPT) is encountered within an assertion.
-This ends the assertion, not the entire pattern match.
+This ends the assertion, not the entire pattern match. The assertion (?!) is
+always optimized to OP_FAIL.


Backtracking control verbs with optional data
@@ -528,7 +529,11 @@
callout at this point. Only assertion conditions may have callouts preceding
the condition.

+A condition that is the negative assertion (?!) is optimized to OP_FAIL in all
+parts of the pattern, so this is another opcode that may appear as a condition.
+It is treated the same as OP_FALSE.

+
Recursion
---------


Modified: code/trunk/src/pcre2_compile.c
===================================================================
--- code/trunk/src/pcre2_compile.c    2015-03-24 08:43:52 UTC (rev 230)
+++ code/trunk/src/pcre2_compile.c    2015-03-24 10:21:34 UTC (rev 231)
@@ -7284,7 +7284,7 @@
 count, because once again the assumption no longer holds.


 Arguments:
-  code           points to start of the compiled pattern
+  code           points to start of the compiled pattern or a group
   bracket_map    a bitmap of which brackets we are inside while testing; this
                    handles up to substring 31; after that we just have to take
                    the less precise approach
@@ -7321,6 +7321,7 @@
        case OP_DNCREF:
        case OP_RREF:
        case OP_DNRREF:
+       case OP_FAIL:
        case OP_FALSE:
        case OP_TRUE:
        return FALSE;


Modified: code/trunk/src/pcre2_dfa_match.c
===================================================================
--- code/trunk/src/pcre2_dfa_match.c    2015-03-24 08:43:52 UTC (rev 230)
+++ code/trunk/src/pcre2_dfa_match.c    2015-03-24 10:21:34 UTC (rev 231)
@@ -2660,14 +2660,15 @@
             condcode == OP_DNRREF)
           return PCRE2_ERROR_DFA_UCOND;


-        /* The DEFINE condition is always false */
-
-        if (condcode == OP_FALSE)
+        /* The DEFINE condition is always false, and the assertion (?!) is
+        converted to OP_FAIL. */
+        
+        if (condcode == OP_FALSE || condcode == OP_FAIL)
           { ADD_ACTIVE(state_offset + codelink + LINK_SIZE + 1, 0); }


         /* There is also an always-true condition */


-        if (condcode == OP_TRUE)
+        else if (condcode == OP_TRUE)
           { ADD_ACTIVE(state_offset + LINK_SIZE + 2 + IMM2_SIZE, 0); }


         /* The only supported version of OP_RREF is for the value RREF_ANY,


Modified: code/trunk/src/pcre2_match.c
===================================================================
--- code/trunk/src/pcre2_match.c    2015-03-24 08:43:52 UTC (rev 230)
+++ code/trunk/src/pcre2_match.c    2015-03-24 10:21:34 UTC (rev 231)
@@ -1408,6 +1408,7 @@
       break;


       case OP_FALSE:
+      case OP_FAIL:   /* The assertion (?!) becomes OP_FAIL */ 
       break;


       case OP_TRUE:


Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2015-03-24 08:43:52 UTC (rev 230)
+++ code/trunk/testdata/testinput2    2015-03-24 10:21:34 UTC (rev 231)
@@ -4229,4 +4229,11 @@
 / 61 28 3f 43 27 78 00 7a 27 29 62/hex,callout_info
     abcdefgh


+/(?(?!)^)/
+
+/(?(?!)a|b)/
+    bbb
+    ** Failers 
+    aaa 
+
 # End of testinput2 


Modified: code/trunk/testdata/testinput6
===================================================================
--- code/trunk/testdata/testinput6    2015-03-24 08:43:52 UTC (rev 230)
+++ code/trunk/testdata/testinput6    2015-03-24 10:21:34 UTC (rev 231)
@@ -4846,4 +4846,8 @@
 / 61 28 3f 43 27 78 00 7a 27 29 62/hex
     abcdefgh


+/(?(?!)a|b)/
+    bbb
+    aaa 
+
 # End of testinput6


Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2015-03-24 08:43:52 UTC (rev 230)
+++ code/trunk/testdata/testoutput2    2015-03-24 10:21:34 UTC (rev 231)
@@ -14188,4 +14188,14 @@
     ^^           b
  0: ab


+/(?(?!)^)/
+
+/(?(?!)a|b)/
+    bbb
+ 0: b
+    ** Failers 
+No match
+    aaa 
+No match
+
 # End of testinput2 


Modified: code/trunk/testdata/testoutput6
===================================================================
--- code/trunk/testdata/testoutput6    2015-03-24 08:43:52 UTC (rev 230)
+++ code/trunk/testdata/testoutput6    2015-03-24 10:21:34 UTC (rev 231)
@@ -7919,4 +7919,10 @@
     ^^           b
  0: ab


+/(?(?!)a|b)/
+    bbb
+ 0: b
+    aaa 
+No match
+
 # End of testinput6
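
Note on the pcre2_dfa_match.c hunk: ChangeLog entry 7 says that (?!) is compiled to OP_FAIL, and that a condition such as (?(?!)a|b) must therefore treat OP_FAIL like the always-false OP_FALSE. The sketch below is a deliberately simplified model of that dispatch, not the PCRE2 source. The enum values, the branch codes, and both function names are hypothetical; only the opcode names come from the diff.

```c
/* Hypothetical stand-ins for the condition opcodes named in the diff.
   The numeric values are arbitrary and do not match PCRE2's opcode table. */
enum cond_op { OP_FALSE, OP_TRUE, OP_FAIL, OP_CREF };

/* Hypothetical result codes: which branch of (?(cond)yes|no) is followed.
   UNSUPPORTED models the "unsupported item" error mentioned in ChangeLog. */
enum branch { UNSUPPORTED = -1, TAKE_NO = 0, TAKE_YES = 1 };

/* Simplified model of the behaviour before r231: OP_FAIL is not
   recognised as a condition, so it ends up reported as unsupported. */
enum branch select_branch_old(enum cond_op condcode)
{
  if (condcode == OP_FALSE) return TAKE_NO;
  if (condcode == OP_TRUE) return TAKE_YES;
  return UNSUPPORTED;
}

/* Simplified model of the behaviour after r231: OP_FAIL is handled
   exactly like OP_FALSE, so the "no" branch is followed. */
enum branch select_branch_new(enum cond_op condcode)
{
  if (condcode == OP_FALSE || condcode == OP_FAIL) return TAKE_NO;
  else if (condcode == OP_TRUE) return TAKE_YES;
  return UNSUPPORTED;
}
```

Under this model, /(?(?!)a|b)/ always follows the "no" branch, which is consistent with the new test output above: "bbb" matches "b" and "aaa" gives "No match".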