[Pcre-svn] [635] code/trunk: Make the test for over-complica…

Author: Subversion repository
To: pcre-svn
Subject: [Pcre-svn] [635] code/trunk: Make the test for over-complication while auto-possessifying bite sooner.
Revision: 635
          http://www.exim.org/viewvc/pcre2?view=rev&revision=635
Author:   ph10
Date:     2016-12-31 13:35:31 +0000 (Sat, 31 Dec 2016)
Log Message:
-----------
Make the test for over-complication while auto-possessifying bite sooner.


Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/src/pcre2_auto_possess.c
    code/trunk/testdata/testinput15
    code/trunk/testdata/testoutput15


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2016-12-29 16:29:05 UTC (rev 634)
+++ code/trunk/ChangeLog    2016-12-31 13:35:31 UTC (rev 635)
@@ -272,7 +272,12 @@
 41. A minor change to pcre2grep: colour reset is now "<esc>[0m" instead of 
 "<esc>[00m".


+42. The limit in the auto-possessification code that was intended to catch
+overly-complicated patterns and not spend too much time auto-possessifying was
+being reset too often, resulting in very long compile times for some patterns.
+Now such patterns are no longer completely auto-possessified.

+
 Version 10.22 29-July-2016
 --------------------------


Modified: code/trunk/src/pcre2_auto_possess.c
===================================================================
--- code/trunk/src/pcre2_auto_possess.c    2016-12-29 16:29:05 UTC (rev 634)
+++ code/trunk/src/pcre2_auto_possess.c    2016-12-31 13:35:31 UTC (rev 635)
@@ -1046,8 +1046,10 @@


 /* Replaces single character iterations with their possessive alternatives
 if appropriate. This function modifies the compiled opcode! Hitting a
-non-existant opcode may indicate a bug in PCRE2, but it can also be caused if a
-bad UTF string was compiled with PCRE2_NO_UTF_CHECK.
+non-existent opcode may indicate a bug in PCRE2, but it can also be caused if a
+bad UTF string was compiled with PCRE2_NO_UTF_CHECK. The rec_limit catches
+overly complicated or large patterns. In these cases, the check just stops,
+leaving the remainder of the pattern unpossessified.

 Arguments:
   code        points to start of the byte code
@@ -1065,7 +1067,7 @@
 PCRE2_SPTR end;
 PCRE2_UCHAR *repeat_opcode;
 uint32_t list[8];
-int rec_limit;
+int rec_limit = 10000;


 for (;;)
   {
@@ -1080,7 +1082,6 @@
       get_chr_property_list(code, utf, cb->fcc, list) : NULL;
     list[1] = c == OP_STAR || c == OP_PLUS || c == OP_QUERY || c == OP_UPTO;


-    rec_limit = 1000;
     if (end != NULL && compare_opcodes(end, utf, cb, list, end, &rec_limit))
       {
       switch(c)
@@ -1137,7 +1138,6 @@


       list[1] = (c & 1) == 0;


-      rec_limit = 1000;
       if (compare_opcodes(end, utf, cb, list, end, &rec_limit))
         {
         switch (c)


Modified: code/trunk/testdata/testinput15
===================================================================
--- code/trunk/testdata/testinput15    2016-12-29 16:29:05 UTC (rev 634)
+++ code/trunk/testdata/testinput15    2016-12-31 13:35:31 UTC (rev 635)
@@ -160,4 +160,9 @@
 /(*NO_AUTO_POSSESS)\w+(?C1)/BI
     abc\=callout_fail=1


+# This test breaks the JIT stack limit 
+
+/(|]+){2,2452}/
+    (|]+){2,2452}
+
 # End of testinput15


Modified: code/trunk/testdata/testoutput15
===================================================================
--- code/trunk/testdata/testoutput15    2016-12-29 16:29:05 UTC (rev 634)
+++ code/trunk/testdata/testoutput15    2016-12-31 13:35:31 UTC (rev 635)
@@ -380,4 +380,11 @@
   1   ^^    
 No match


+# This test breaks the JIT stack limit 
+
+/(|]+){2,2452}/
+    (|]+){2,2452}
+ 0: 
+ 1: 
+
 # End of testinput15
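
Note on the change: the diff moves the rec_limit initialisation out of the
scanning loop, so the budget is shared across the whole pattern instead of
being refreshed before every compare_opcodes() call. The following is a
minimal sketch of that effect, not PCRE2 code. The names costly_check,
scan_with_reset and scan_with_shared_budget are hypothetical, and the
"break" on exhaustion is a simplification of the real loop.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an expensive recursive comparison. Each call
   spends one unit of the budget and gives up once the budget runs out. */
static bool costly_check(int depth, int *budget, long *work)
{
    if (--(*budget) <= 0) return false;   /* budget exhausted: give up */
    ++(*work);                            /* count one unit of real work */
    if (depth == 0) return true;
    /* Two recursive branches make the cost grow quickly with depth. */
    return costly_check(depth - 1, budget, work) &&
           costly_check(depth - 1, budget, work);
}

/* Sketch of the old behaviour: the budget is reset before every item, so
   the total work can reach roughly items * limit. */
long scan_with_reset(int items, int depth, int limit)
{
    long work = 0;
    for (int i = 0; i < items; i++) {
        int budget = limit;               /* reset on every iteration */
        (void)costly_check(depth, &budget, &work);
    }
    return work;
}

/* Sketch of the new behaviour: one budget for the whole scan, so the total
   work is bounded by limit regardless of the number of items. */
long scan_with_shared_budget(int items, int depth, int limit)
{
    long work = 0;
    int budget = limit;                   /* initialised once, up front */
    for (int i = 0; i < items; i++) {
        (void)costly_check(depth, &budget, &work);
        if (budget <= 0) break;           /* remaining items are skipped */
    }
    return work;
}
```

With 50 items, depth 20 and a limit of 1000, the reset variant performs 50
times as much work as the shared-budget variant. This mirrors the trade-off
stated in the ChangeLog entry: compile time stays bounded, at the cost of
leaving the remainder of the pattern unpossessified.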