Revision: 1631
http://vcs.pcre.org/viewvc?view=rev&revision=1631
Author: ph10
Date: 2016-02-10 19:13:17 +0000 (Wed, 10 Feb 2016)
Log Message:
-----------
Fix workspace overflow for (*ACCEPT) with deeply nested parentheses.
Modified Paths:
--------------
code/trunk/ChangeLog
code/trunk/pcre_compile.c
code/trunk/pcre_internal.h
code/trunk/pcreposix.c
code/trunk/testdata/testinput11
code/trunk/testdata/testoutput11-16
code/trunk/testdata/testoutput11-32
code/trunk/testdata/testoutput11-8
Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog 2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/ChangeLog 2016-02-10 19:13:17 UTC (rev 1631)
@@ -7,17 +7,17 @@
Version 8.39 xx-xxxxxx-201x
---------------------------
-1. If PCRE_AUTO_CALLOUT was set on a pattern that had a (?# comment between
- an item and its qualifier (for example, A(?#comment)?B) pcre_compile()
+1. If PCRE_AUTO_CALLOUT was set on a pattern that had a (?# comment between
+ an item and its qualifier (for example, A(?#comment)?B) pcre_compile()
misbehaved. This bug was found by the LLVM fuzzer.
-
+
2. Similar to the above, if an isolated \E was present between an item and its
qualifier when PCRE_AUTO_CALLOUT was set, pcre_compile() misbehaved. This
bug was found by the LLVM fuzzer.
-3. Further to 8.38/46, negated classes such as [^[:^ascii:]\d] were also not
+3. Further to 8.38/46, negated classes such as [^[:^ascii:]\d] were also not
working correctly in UCP mode.
-
+
4. The POSIX wrapper function regexec() crashed if the option REG_STARTEND
was set when the pmatch argument was NULL. It now returns REG_INVARG.
@@ -26,31 +26,35 @@
6. An empty \Q\E sequence between an item and its qualifier caused
pcre_compile() to misbehave when auto callouts were enabled. This bug was
found by the LLVM fuzzer.
-
-7. If a pattern that was compiled with PCRE_EXTENDED started with white
- space or a #-type comment that was followed by (?-x), which turns off
+
+7. If a pattern that was compiled with PCRE_EXTENDED started with white
+ space or a #-type comment that was followed by (?-x), which turns off
PCRE_EXTENDED, and there was no subsequent (?x) to turn it on again,
pcre_compile() assumed that (?-x) applied to the whole pattern and
consequently mis-compiled it. This bug was found by the LLVM fuzzer.
-
+
8. A call of pcre_copy_named_substring() for a named substring whose number
was greater than the space in the ovector could cause a crash.
-
+
9. Yet another buffer overflow bug involved duplicate named groups with a
group that reset capture numbers (compare 8.38/7 below). Once again, I have
just allowed for more memory, even if not needed. (A proper fix is
implemented in PCRE2, but it involves a lot of refactoring.)
-
-10. pcre_get_substring_list() crashed if the use of \K in a match caused the
- start of the match to be earlier than the end.
+10. pcre_get_substring_list() crashed if the use of \K in a match caused the
+ start of the match to be earlier than the end.
+
11. Migrating appropriate PCRE2 JIT improvements to PCRE.
12. A pattern such as /(?<=((?C)0))/, which has a callout inside a lookbehind
- assertion, caused pcretest to generate incorrect output, and also to read
+ assertion, caused pcretest to generate incorrect output, and also to read
uninitialized memory (detected by ASAN or valgrind).
+13. A pattern that included (*ACCEPT) in the middle of a sufficiently deeply
+ nested set of parentheses of sufficient size caused an overflow of the
+ compiling workspace (which was diagnosed, but of course is not desirable).
+
Version 8.38 23-November-2015
-----------------------------
Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c 2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/pcre_compile.c 2016-02-10 19:13:17 UTC (rev 1631)
@@ -6,7 +6,7 @@
and semantics are as close as possible to those of the Perl 5 language.
Written by Philip Hazel
- Copyright (c) 1997-2014 University of Cambridge
+ Copyright (c) 1997-2016 University of Cambridge
-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -560,6 +560,7 @@
/* 85 */
"parentheses are too deeply nested (stack check)\0"
"digits missing in \\x{} or \\o{}\0"
+ "regular expression is too complicated\0"
;
/* Table to identify digits and hex digits. This is used when compiling
@@ -4591,7 +4592,8 @@
if (code > cd->start_workspace + cd->workspace_size -
WORK_SIZE_SAFETY_MARGIN) /* Check for overrun */
{
- *errorcodeptr = ERR52;
+ *errorcodeptr = (code >= cd->start_workspace + cd->workspace_size)?
+ ERR52 : ERR87;
goto FAILED;
}
@@ -6626,8 +6628,21 @@
cd->had_accept = TRUE;
for (oc = cd->open_caps; oc != NULL; oc = oc->next)
{
- *code++ = OP_CLOSE;
- PUT2INC(code, 0, oc->number);
+ if (lengthptr != NULL)
+ {
+#ifdef COMPILE_PCRE8
+ *lengthptr += 1 + IMM2_SIZE;
+#elif defined COMPILE_PCRE16
+ *lengthptr += 2 + IMM2_SIZE;
+#elif defined COMPILE_PCRE32
+ *lengthptr += 4 + IMM2_SIZE;
+#endif
+ }
+ else
+ {
+ *code++ = OP_CLOSE;
+ PUT2INC(code, 0, oc->number);
+ }
}
setverb = *code++ =
(cd->assert_depth > 0)? OP_ASSERT_ACCEPT : OP_ACCEPT;
Modified: code/trunk/pcre_internal.h
===================================================================
--- code/trunk/pcre_internal.h 2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/pcre_internal.h 2016-02-10 19:13:17 UTC (rev 1631)
@@ -7,7 +7,7 @@
and semantics are as close as possible to those of the Perl 5 language.
Written by Philip Hazel
- Copyright (c) 1997-2014 University of Cambridge
+ Copyright (c) 1997-2016 University of Cambridge
-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -2289,7 +2289,7 @@
ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59,
ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69,
ERR70, ERR71, ERR72, ERR73, ERR74, ERR75, ERR76, ERR77, ERR78, ERR79,
- ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERRCOUNT };
+ ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERR87, ERRCOUNT };
/* JIT compiling modes. The function list is indexed by them. */
Modified: code/trunk/pcreposix.c
===================================================================
--- code/trunk/pcreposix.c 2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/pcreposix.c 2016-02-10 19:13:17 UTC (rev 1631)
@@ -6,7 +6,7 @@
and semantics are as close as possible to those of the Perl 5 language.
Written by Philip Hazel
- Copyright (c) 1997-2014 University of Cambridge
+ Copyright (c) 1997-2016 University of Cambridge
-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -173,7 +173,8 @@
REG_BADPAT, /* group name must start with a non-digit */
/* 85 */
REG_BADPAT, /* parentheses too deeply nested (stack check) */
- REG_BADPAT /* missing digits in \x{} or \o{} */
+ REG_BADPAT, /* missing digits in \x{} or \o{} */
+ REG_BADPAT /* pattern too complicated */
};
/* Table of texts corresponding to POSIX error codes */
Modified: code/trunk/testdata/testinput11
===================================================================
--- code/trunk/testdata/testinput11 2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/testdata/testinput11 2016-02-10 19:13:17 UTC (rev 1631)
@@ -138,4 +138,6 @@
/.((?2)(?R)\1)()/B

+
/-- End of testinput11 --/
Modified: code/trunk/testdata/testoutput11-16
===================================================================
--- code/trunk/testdata/testoutput11-16 2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/testdata/testoutput11-16 2016-02-10 19:13:17 UTC (rev 1631)
@@ -765,4 +765,7 @@
25 End
------------------------------------------------------------------

+Failed: regular expression is too complicated at offset 490
+
/-- End of testinput11 --/
Modified: code/trunk/testdata/testoutput11-32
===================================================================
--- code/trunk/testdata/testoutput11-32 2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/testdata/testoutput11-32 2016-02-10 19:13:17 UTC (rev 1631)
@@ -765,4 +765,7 @@
25 End
------------------------------------------------------------------

+Failed: missing ) at offset 509
+
/-- End of testinput11 --/
Modified: code/trunk/testdata/testoutput11-8
===================================================================
--- code/trunk/testdata/testoutput11-8 2016-02-10 10:53:45 UTC (rev 1630)
+++ code/trunk/testdata/testoutput11-8 2016-02-10 19:13:17 UTC (rev 1631)
@@ -765,4 +765,7 @@
38 End
------------------------------------------------------------------

+Failed: missing ) at offset 509
+
/-- End of testinput11 --/
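
For readers who want the shape of the pcre_compile.c change in isolation, here is a minimal sketch of the two ideas in the hunks above: only counting the space needed for the closing items while the pattern is being measured (lengthptr non-NULL), and distinguishing a real workspace overrun from a breach of the safety margin. Every name below (close_open_groups, check_workspace, the SKETCH_* constants, the open_cap layout) is a hypothetical stand-in, not PCRE's real definition, and the 8-bit sizes are assumed for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not PCRE's actual values. */
enum { SKETCH_OP_CLOSE = 1 };   /* opcode that closes an open capture group */
enum { SKETCH_IMM2_SIZE = 2 };  /* code units in a 2-byte immediate (8-bit build assumed) */

/* One entry per capture group that is still open. */
typedef struct open_cap {
  struct open_cap *next;
  unsigned number;
} open_cap;

typedef enum {
  SKETCH_OK,
  SKETCH_TOO_COMPLICATED,       /* inside the safety margin (ERR87 in the diff) */
  SKETCH_OVERRAN                /* at or past the end of the workspace (ERR52 in the diff) */
} sketch_status;

/* When lengthptr is non-NULL, only add up the space the closing items
   would need and leave the buffer untouched. Otherwise emit them.
   Returns the (possibly advanced) code pointer. */
static unsigned char *close_open_groups(unsigned char *code,
                                        const open_cap *caps,
                                        size_t *lengthptr)
{
  const open_cap *oc;
  for (oc = caps; oc != NULL; oc = oc->next) {
    if (lengthptr != NULL) {
      *lengthptr += 1 + SKETCH_IMM2_SIZE;
    } else {
      *code++ = SKETCH_OP_CLOSE;
      *code++ = (unsigned char)(oc->number >> 8);    /* high byte of group number */
      *code++ = (unsigned char)(oc->number & 0xff);  /* low byte of group number */
    }
  }
  return code;
}

/* Mirror of the overrun test: report OVERRAN only when the pointer has
   reached the end of the workspace, and TOO_COMPLICATED when it has
   merely entered the safety margin. */
static sketch_status check_workspace(const unsigned char *code,
                                     const unsigned char *start,
                                     size_t size, size_t margin)
{
  if (code > start + size - margin)
    return (code >= start + size) ? SKETCH_OVERRAN : SKETCH_TOO_COMPLICATED;
  return SKETCH_OK;
}
```

With two open groups, a measuring call adds 6 to the running length and writes nothing, while an emitting call writes two 3-unit items. That is why a deep nest no longer fills the workspace during the measuring pass.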