[Pcre-svn] [1631] code/trunk: Fix workspace overflow for (*A…

Author: Subversion repository
To: pcre-svn
Subject: [Pcre-svn] [1631] code/trunk: Fix workspace overflow for (*ACCEPT) with deeply nested parentheses.
Revision: 1631
          http://vcs.pcre.org/viewvc?view=rev&revision=1631
Author:   ph10
Date:     2016-02-10 19:13:17 +0000 (Wed, 10 Feb 2016)
Log Message:
-----------
Fix workspace overflow for (*ACCEPT) with deeply nested parentheses.


Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/pcre_compile.c
    code/trunk/pcre_internal.h
    code/trunk/pcreposix.c
    code/trunk/testdata/testinput11
    code/trunk/testdata/testoutput11-16
    code/trunk/testdata/testoutput11-32
    code/trunk/testdata/testoutput11-8


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/ChangeLog    2016-02-10 19:13:17 UTC (rev 1631)
@@ -7,17 +7,17 @@
 Version 8.39 xx-xxxxxx-201x
 ---------------------------


-1.  If PCRE_AUTO_CALLOUT was set on a pattern that had a (?# comment between 
-    an item and its qualifier (for example, A(?#comment)?B) pcre_compile() 
+1.  If PCRE_AUTO_CALLOUT was set on a pattern that had a (?# comment between
+    an item and its qualifier (for example, A(?#comment)?B) pcre_compile()
     misbehaved. This bug was found by the LLVM fuzzer.
-    
+
 2.  Similar to the above, if an isolated \E was present between an item and its
     qualifier when PCRE_AUTO_CALLOUT was set, pcre_compile() misbehaved. This
     bug was found by the LLVM fuzzer.


-3.  Further to 8.38/46, negated classes such as [^[:^ascii:]\d] were also not 
+3.  Further to 8.38/46, negated classes such as [^[:^ascii:]\d] were also not
     working correctly in UCP mode.
-    
+
 4.  The POSIX wrapper function regexec() crashed if the option REG_STARTEND
     was set when the pmatch argument was NULL. It now returns REG_INVARG.


@@ -26,31 +26,35 @@
 6.  An empty \Q\E sequence between an item and its qualifier caused
     pcre_compile() to misbehave when auto callouts were enabled. This bug was
     found by the LLVM fuzzer.
-    
-7.  If a pattern that was compiled with PCRE_EXTENDED started with white 
-    space or a #-type comment that was followed by (?-x), which turns off 
+
+7.  If a pattern that was compiled with PCRE_EXTENDED started with white
+    space or a #-type comment that was followed by (?-x), which turns off
     PCRE_EXTENDED, and there was no subsequent (?x) to turn it on again,
     pcre_compile() assumed that (?-x) applied to the whole pattern and
     consequently mis-compiled it. This bug was found by the LLVM fuzzer.
-    
+
 8.  An call of pcre_copy_named_substring() for a named substring whose number
     was greater than the space in the ovector could cause a crash.
-    
+
 9.  Yet another buffer overflow bug involved duplicate named groups with a
     group that reset capture numbers (compare 8.38/7 below). Once again, I have
     just allowed for more memory, even if not needed. (A proper fix is
     implemented in PCRE2, but it involves a lot of refactoring.)
-    
-10. pcre_get_substring_list() crashed if the use of \K in a match caused the 
-    start of the match to be earlier than the end. 


+10. pcre_get_substring_list() crashed if the use of \K in a match caused the
+    start of the match to be earlier than the end.
+
 11. Migrating appropriate PCRE2 JIT improvements to PCRE.


 12. A pattern such as /(?<=((?C)0))/, which has a callout inside a lookbehind
-    assertion, caused pcretest to generate incorrect output, and also to read 
+    assertion, caused pcretest to generate incorrect output, and also to read
     uninitialized memory (detected by ASAN or valgrind).


+13. A pattern that included (*ACCEPT) in the middle of a sufficiently deeply
+    nested set of parentheses of sufficient size caused an overflow of the
+    compiling workspace (which was diagnosed, but of course is not desirable).


+
Version 8.38 23-November-2015
-----------------------------


Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c    2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/pcre_compile.c    2016-02-10 19:13:17 UTC (rev 1631)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.


                        Written by Philip Hazel
-           Copyright (c) 1997-2014 University of Cambridge
+           Copyright (c) 1997-2016 University of Cambridge


-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -560,6 +560,7 @@
   /* 85 */
   "parentheses are too deeply nested (stack check)\0"
   "digits missing in \\x{} or \\o{}\0"
+  "regular expression is too complicated\0"
   ;

 /* Table to identify digits and hex digits. This is used when compiling
@@ -4591,7 +4592,8 @@
     if (code > cd->start_workspace + cd->workspace_size -
         WORK_SIZE_SAFETY_MARGIN)                       /* Check for overrun */
       {
-      *errorcodeptr = ERR52;
+      *errorcodeptr = (code >= cd->start_workspace + cd->workspace_size)?
+        ERR52 : ERR87;
       goto FAILED;
       }


@@ -6626,8 +6628,21 @@
             cd->had_accept = TRUE;
             for (oc = cd->open_caps; oc != NULL; oc = oc->next)
               {
-              *code++ = OP_CLOSE;
-              PUT2INC(code, 0, oc->number);
+              if (lengthptr != NULL)
+                {
+#ifdef COMPILE_PCRE8
+                *lengthptr += 1 + IMM2_SIZE;
+#elif defined COMPILE_PCRE16
+                *lengthptr += 2 + IMM2_SIZE;
+#elif defined COMPILE_PCRE32
+                *lengthptr += 4 + IMM2_SIZE;
+#endif
+                }
+              else
+                {
+                *code++ = OP_CLOSE;
+                PUT2INC(code, 0, oc->number);
+                }
               }
             setverb = *code++ =
               (cd->assert_depth > 0)? OP_ASSERT_ACCEPT : OP_ACCEPT;
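
The hunk above stops writing OP_CLOSE items into the workspace when lengthptr is non-NULL, and adds their size to the running length instead. A minimal sketch of that measure-or-emit pattern follows, assuming an 8-bit code unit. The names close_open_caps, open_cap, SKETCH_OP_CLOSE and SKETCH_IMM2_SIZE are hypothetical stand-ins for illustration, not the real PCRE definitions, and the high-byte-first store only approximates what PUT2INC does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real PCRE opcode or size values. */
enum { SKETCH_OP_CLOSE = 1, SKETCH_IMM2_SIZE = 2 };

/* One entry per currently open capturing group (illustrative only). */
typedef struct open_cap {
  struct open_cap *next;
  unsigned number;
} open_cap;

/* Handle the open captures that precede an (*ACCEPT)-like item.
   If lengthptr is non-NULL this is the measuring pass: add the size that
   would be emitted, write nothing, and return code unchanged.
   If lengthptr is NULL this is the real pass: write one opcode followed by
   the group number as two bytes, high byte first, for each open capture. */
static unsigned char *close_open_caps(unsigned char *code,
                                      const open_cap *caps,
                                      size_t *lengthptr)
{
  const open_cap *oc;
  for (oc = caps; oc != NULL; oc = oc->next) {
    if (lengthptr != NULL) {
      *lengthptr += 1 + SKETCH_IMM2_SIZE;
    } else {
      assert(oc->number <= 0xffff);   /* must fit in two bytes */
      *code++ = SKETCH_OP_CLOSE;
      *code++ = (unsigned char)(oc->number >> 8);
      *code++ = (unsigned char)(oc->number & 0xff);
    }
  }
  return code;
}
```

With this shape, the measuring pass never advances the write pointer for these items, so the number of open groups no longer affects how much workspace the pre-scan consumes.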


Modified: code/trunk/pcre_internal.h
===================================================================
--- code/trunk/pcre_internal.h    2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/pcre_internal.h    2016-02-10 19:13:17 UTC (rev 1631)
@@ -7,7 +7,7 @@
 and semantics are as close as possible to those of the Perl 5 language.


                        Written by Philip Hazel
-           Copyright (c) 1997-2014 University of Cambridge
+           Copyright (c) 1997-2016 University of Cambridge


 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -2289,7 +2289,7 @@
        ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59,
        ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69,
        ERR70, ERR71, ERR72, ERR73, ERR74, ERR75, ERR76, ERR77, ERR78, ERR79,
-       ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERRCOUNT };
+       ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERR87, ERRCOUNT };


/* JIT compiling modes. The function list is indexed by them. */


Modified: code/trunk/pcreposix.c
===================================================================
--- code/trunk/pcreposix.c    2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/pcreposix.c    2016-02-10 19:13:17 UTC (rev 1631)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.


                        Written by Philip Hazel
-           Copyright (c) 1997-2014 University of Cambridge
+           Copyright (c) 1997-2016 University of Cambridge


-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -173,7 +173,8 @@
REG_BADPAT, /* group name must start with a non-digit */
/* 85 */
REG_BADPAT, /* parentheses too deeply nested (stack check) */
- REG_BADPAT /* missing digits in \x{} or \o{} */
+ REG_BADPAT, /* missing digits in \x{} or \o{} */
+ REG_BADPAT /* pattern too complicated */
};

/* Table of texts corresponding to POSIX error codes */

Modified: code/trunk/testdata/testinput11
===================================================================
--- code/trunk/testdata/testinput11    2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/testdata/testinput11    2016-02-10 19:13:17 UTC (rev 1631)
@@ -138,4 +138,6 @@


/.((?2)(?R)\1)()/B

+/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
+
/-- End of testinput11 --/

Modified: code/trunk/testdata/testoutput11-16
===================================================================
--- code/trunk/testdata/testoutput11-16    2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/testdata/testoutput11-16    2016-02-10 19:13:17 UTC (rev 1631)
@@ -765,4 +765,7 @@
  25     End
 ------------------------------------------------------------------


+/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
+Failed: regular expression is too complicated at offset 490
+
/-- End of testinput11 --/

Modified: code/trunk/testdata/testoutput11-32
===================================================================
--- code/trunk/testdata/testoutput11-32    2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/testdata/testoutput11-32    2016-02-10 19:13:17 UTC (rev 1631)
@@ -765,4 +765,7 @@
  25     End
 ------------------------------------------------------------------


+/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
+Failed: missing ) at offset 509
+
/-- End of testinput11 --/

Modified: code/trunk/testdata/testoutput11-8
===================================================================
--- code/trunk/testdata/testoutput11-8    2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/testdata/testoutput11-8    2016-02-10 19:13:17 UTC (rev 1631)
@@ -765,4 +765,7 @@
  38     End
 ------------------------------------------------------------------


+/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
+Failed: missing ) at offset 509
+
/-- End of testinput11 --/
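
The changed overrun check in pcre_compile.c separates two cases: the write pointer has actually reached or passed the end of the workspace (ERR52), or it has only entered the safety margin (the new ERR87, "regular expression is too complicated"). The sketch below illustrates that classification in isolation. check_workspace, the sketch_error enum and the numeric margin are hypothetical names and values chosen for the example, not part of PCRE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes mirroring the two errors used in the diff. */
enum sketch_error { SKETCH_OK = 0, SKETCH_ERR52 = 52, SKETCH_ERR87 = 87 };

/* Classify the write pointer `code` against a workspace of `size` bytes
   starting at `start`, with `margin` bytes reserved at the top:
     - at or below start + size - margin: still fine
     - inside the margin:                 "too complicated" (ERR87)
     - at or past start + size:           a real overrun   (ERR52)  */
static enum sketch_error check_workspace(const unsigned char *start,
                                         size_t size, size_t margin,
                                         const unsigned char *code)
{
  assert(margin <= size);
  if (code > start + size - margin) {
    return (code >= start + size) ? SKETCH_ERR52 : SKETCH_ERR87;
  }
  return SKETCH_OK;
}
```

For example, with a 100-byte workspace and a 10-byte margin, a pointer at offset 95 would be reported as SKETCH_ERR87 and a pointer at offset 100 as SKETCH_ERR52.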