[Pcre-svn] [1032] code/trunk: Fix zero-repeated subroutine c…

Author: Subversion repository
To: pcre-svn
Subject: [Pcre-svn] [1032] code/trunk: Fix zero-repeated subroutine call at start of pattern bug, which recorded an
Revision: 1032
          http://www.exim.org/viewvc/pcre2?view=rev&revision=1032
Author:   ph10
Date:     2018-10-20 10:28:02 +0100 (Sat, 20 Oct 2018)
Log Message:
-----------
Fix zero-repeated subroutine call at start of pattern bug, which recorded an 
incorrect first code unit.


Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/src/pcre2_compile.c
    code/trunk/testdata/testinput1
    code/trunk/testdata/testoutput1


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2018-10-19 15:31:16 UTC (rev 1031)
+++ code/trunk/ChangeLog    2018-10-20 09:28:02 UTC (rev 1032)
@@ -43,7 +43,12 @@
 for tidiness - none of the substring extractors should reference this after 
 match failure.


+11. If a pattern started with a subroutine call that had a quantifier with a
+minimum of zero, an incorrect "match must start with this character" could be
+recorded. Example: /(?&xxx)*ABC(?<xxx>XYZ)/ would (incorrectly) expect 'A' to
+be the first character of a match.

+
Version 10.32 10-September-2018
-------------------------------


Modified: code/trunk/src/pcre2_compile.c
===================================================================
--- code/trunk/src/pcre2_compile.c    2018-10-19 15:31:16 UTC (rev 1031)
+++ code/trunk/src/pcre2_compile.c    2018-10-20 09:28:02 UTC (rev 1032)
@@ -6095,7 +6095,7 @@
       }
     goto GROUP_PROCESS_NOTE_EMPTY;


-    /* The DEFINE condition is always false. It's internal groups may never
+    /* The DEFINE condition is always false. Its internal groups may never
     be called, so matched_char must remain false, hence the jump to
     GROUP_PROCESS rather than GROUP_PROCESS_NOTE_EMPTY. */


@@ -6435,8 +6435,8 @@
           groupnumber = ng->number;


           /* For a recursion, that's all that is needed. We can now go to
-          the code above that handles numerical recursion, applying it to
-          the first group with the given name. */
+          the code that handles numerical recursion, applying it to the first
+          group with the given name. */


           if (meta == META_RECURSE_BYNAME)
             {
@@ -7486,6 +7486,8 @@
     groupsetfirstcu = FALSE;
     cb->had_recurse = TRUE;
     if (firstcuflags == REQ_UNSET) firstcuflags = REQ_NONE;
+    zerofirstcu = firstcu;
+    zerofirstcuflags = firstcuflags;   
     break;
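
Editorial note on the two added lines: the sketch below is a toy model, not code from pcre2_compile.c. It assumes that a zero-minimum quantifier restores firstcu/firstcuflags from the saved zerofirstcu/zerofirstcuflags pair, and that a subroutine call previously changed firstcuflags to REQ_NONE without updating the saved pair. The function toy_first_cu, the struct, the enum values, and the one-letter pattern alphabet ('R' for a subroutine call, '*' for a zero-minimum quantifier, anything else a literal) are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the flag values; not the real PCRE2 constants. */
enum { REQ_UNSET = -2, REQ_NONE = -1, REQ_SET = 0 };

typedef struct {
  int firstcu, firstcuflags;          /* current "must start with" record */
  int zerofirstcu, zerofirstcuflags;  /* record to restore if the item just
                                         seen turns out to repeat zero times */
} toy_state;

/* Scan a toy pattern and return the recorded first code unit, or -1 if no
   first code unit is recorded. apply_fix selects whether the subroutine-call
   case also saves the "zero" pair, as the two added lines do. */
static int toy_first_cu(const char *pattern, int apply_fix)
{
  toy_state s = { 0, REQ_UNSET, 0, REQ_UNSET };
  assert(pattern != NULL);

  for (const char *p = pattern; *p != '\0'; p++)
    {
    if (*p == 'R')                    /* subroutine call */
      {
      if (s.firstcuflags == REQ_UNSET) s.firstcuflags = REQ_NONE;
      if (apply_fix)
        {
        s.zerofirstcu = s.firstcu;
        s.zerofirstcuflags = s.firstcuflags;
        }
      }
    else if (*p == '*')               /* previous item may match zero times */
      {
      s.firstcu = s.zerofirstcu;
      s.firstcuflags = s.zerofirstcuflags;
      }
    else                              /* literal character */
      {
      if (s.firstcuflags == REQ_UNSET)
        {
        /* If this literal is later zero-repeated, nothing is known. */
        s.zerofirstcuflags = REQ_NONE;
        s.firstcu = *p;
        s.firstcuflags = REQ_SET;
        }
      else
        {
        s.zerofirstcu = s.firstcu;
        s.zerofirstcuflags = s.firstcuflags;
        }
      }
    }

  return (s.firstcuflags == REQ_SET) ? s.firstcu : -1;
}
```

In this model, toy_first_cu("R*ABC", 0) returns 'A', because the quantifier rolls the flags back to REQ_UNSET and the following literal is then recorded as the first code unit. toy_first_cu("R*ABC", 1) returns -1, because the rollback now restores REQ_NONE. This mirrors the ChangeLog example /(?&xxx)*ABC(?<xxx>XYZ)/ only in outline.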




Modified: code/trunk/testdata/testinput1
===================================================================
--- code/trunk/testdata/testinput1    2018-10-19 15:31:16 UTC (rev 1031)
+++ code/trunk/testdata/testinput1    2018-10-20 09:28:02 UTC (rev 1032)
@@ -6328,4 +6328,19 @@
 \= Expect no match
     aaaa      


+/   (?<word> \w+ )*    \.   /xi
+    pokus.
+    
+/(?(DEFINE) (?<word> \w+ ) ) (?&word)*   \./xi
+    pokus.
+
+/(?(DEFINE) (?<word> \w+ ) ) ( (?&word)* )   \./xi 
+    pokus.
+
+/(?&word)*  (?(DEFINE) (?<word> \w+ ) )  \./xi
+    pokus.
+
+/(?&word)*  \. (?<word> \w+ )/xi
+    pokus.hokus
+
 # End of testinput1 


Modified: code/trunk/testdata/testoutput1
===================================================================
--- code/trunk/testdata/testoutput1    2018-10-19 15:31:16 UTC (rev 1031)
+++ code/trunk/testdata/testoutput1    2018-10-20 09:28:02 UTC (rev 1032)
@@ -10025,4 +10025,28 @@
     aaaa      
 No match


+/   (?<word> \w+ )*    \.   /xi
+    pokus.
+ 0: pokus.
+ 1: pokus
+    
+/(?(DEFINE) (?<word> \w+ ) ) (?&word)*   \./xi
+    pokus.
+ 0: pokus.
+
+/(?(DEFINE) (?<word> \w+ ) ) ( (?&word)* )   \./xi 
+    pokus.
+ 0: pokus.
+ 1: <unset>
+ 2: pokus
+
+/(?&word)*  (?(DEFINE) (?<word> \w+ ) )  \./xi
+    pokus.
+ 0: pokus.
+
+/(?&word)*  \. (?<word> \w+ )/xi
+    pokus.hokus
+ 0: pokus.hokus
+ 1: hokus
+
 # End of testinput1