Revision: 320
http://www.exim.org/viewvc/pcre2?view=rev&revision=320
Author: ph10
Date: 2015-07-21 14:42:14 +0100 (Tue, 21 Jul 2015)
Log Message:
-----------
Fix "running for ever" bug for deeply nested [: sequences.
Modified Paths:
--------------
code/trunk/ChangeLog
code/trunk/src/pcre2_compile.c
code/trunk/testdata/testinput2
code/trunk/testdata/testoutput2
Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog 2015-07-20 10:17:23 UTC (rev 319)
+++ code/trunk/ChangeLog 2015-07-21 13:42:14 UTC (rev 320)
@@ -58,8 +58,12 @@
error diagnosis. Examples are: /[[:\\](?<[::]/ and /[[:\\](?'abc')[a:]. The
first of these bugs was discovered by Karl Skomski with the LLVM fuzzer.
+16. Pathological patterns containing many nested occurrences of [: caused
+pcre2_compile() to run for a very long time. This bug was found by the LLVM
+fuzzer.
+
Version 10.20 30-June-2015
--------------------------
Modified: code/trunk/src/pcre2_compile.c
===================================================================
--- code/trunk/src/pcre2_compile.c 2015-07-20 10:17:23 UTC (rev 319)
+++ code/trunk/src/pcre2_compile.c 2015-07-21 13:42:14 UTC (rev 320)
@@ -2583,7 +2583,9 @@
A user pointed out that PCRE was rejecting [:a[:digit:]] whereas Perl was not.
It seems that the appearance of a nested POSIX class supersedes an apparent
external class. For example, [:a[:digit:]b:] matches "a", "b", ":", or
-a digit.
+a digit. This is handled by returning FALSE if the start of a new group with
+the same terminator is encountered, since the next closing sequence must close
+the nested group, not the outer one.
In Perl, unescaped square brackets may also appear as part of class names. For
example, [:a[:abc]b:] gives unknown POSIX class "[:abc]b:]". However, for
@@ -2609,21 +2611,15 @@
if (*ptr == CHAR_BACKSLASH &&
(ptr[1] == CHAR_RIGHT_SQUARE_BRACKET || ptr[1] == CHAR_BACKSLASH))
ptr++;
- else if (*ptr == CHAR_RIGHT_SQUARE_BRACKET) return FALSE;
- else
+ else if ((*ptr == CHAR_LEFT_SQUARE_BRACKET && ptr[1] == terminator) ||
+ *ptr == CHAR_RIGHT_SQUARE_BRACKET) return FALSE;
+ else if (*ptr == terminator && ptr[1] == CHAR_RIGHT_SQUARE_BRACKET)
{
- if (*ptr == terminator && ptr[1] == CHAR_RIGHT_SQUARE_BRACKET)
- {
- *endptr = ptr;
- return TRUE;
- }
- if (*ptr == CHAR_LEFT_SQUARE_BRACKET &&
- (ptr[1] == CHAR_COLON || ptr[1] == CHAR_DOT ||
- ptr[1] == CHAR_EQUALS_SIGN) &&
- check_posix_syntax(ptr, endptr))
- return FALSE;
+ *endptr = ptr;
+ return TRUE;
}
}
+
return FALSE;
}
Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2 2015-07-20 10:17:23 UTC (rev 319)
+++ code/trunk/testdata/testinput2 2015-07-21 13:42:14 UTC (rev 320)
@@ -4350,4 +4350,6 @@
/[[:\\](?'abc')[a:]/I
+"[[[.\xe8Nq\xffq\xff\xe0\x2|||::Nq\xffq\xff\xe0\x6\x2|||::[[[:[::::::[[[[[::::::::[:[[[:[:::[[[[[[[[[[[[:::::::::::::::::[[.\xe8Nq\xffq\xff\xe0\x2|||::Nq\xffq\xff\xe0\x6\x2|||::[[[:[::::::[[[[[::::::::[:[[[:[:::[[[[[[[[[[[[[[:::E[[[:[:[[:[:::[[:::E[[[:[:[[:'[:::::E[[[:[::::::[[[:[[[[[[[::E[[[:[::::::[[[:[[[[[[[[:[[::[::::[[:::::::[[:[[[[[[[:[[::[:[[:[~"
+
# End of testinput2
Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2 2015-07-20 10:17:23 UTC (rev 319)
+++ code/trunk/testdata/testoutput2 2015-07-21 13:42:14 UTC (rev 320)
@@ -14534,4 +14534,7 @@
Starting code units: : [ \
Subject length lower bound = 2
+"[[[.\xe8Nq\xffq\xff\xe0\x2|||::Nq\xffq\xff\xe0\x6\x2|||::[[[:[::::::[[[[[::::::::[:[[[:[:::[[[[[[[[[[[[:::::::::::::::::[[.\xe8Nq\xffq\xff\xe0\x2|||::Nq\xffq\xff\xe0\x6\x2|||::[[[:[::::::[[[[[::::::::[:[[[:[:::[[[[[[[[[[[[[[:::E[[[:[:[[:[:::[[:::E[[[:[:[[:'[:::::E[[[:[::::::[[[:[[[[[[[::E[[[:[::::::[[[:[[[[[[[[:[[::[::::[[:::::::[[:[[[[[[[:[[::[:[[:[~"
+Failed: error 106 at offset 353: missing terminating ] for character class
+
# End of testinput2