[Pcre-svn] [888] code/trunk: Fix incorrect first matching ch…

Author: Subversion repository
To: pcre-svn
Subject: [Pcre-svn] [888] code/trunk: Fix incorrect first matching character when a backreference with zero minimum
Revision: 888
          http://www.exim.org/viewvc/pcre2?view=rev&revision=888
Author:   ph10
Date:     2017-12-12 15:01:51 +0000 (Tue, 12 Dec 2017)
Log Message:
-----------
Fix incorrect first matching character when a backreference with zero minimum 
repeat starts a pattern (possibly after assertions).
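
To illustrate why the zero minimum matters, here is a minimal C sketch of the
bookkeeping. It is a toy model, not PCRE2's actual code. It assumes (the log
message does not say this) that a zero-minimum quantifier makes the compiler
fall back to a saved "before the previous item" state, as the name
zerofirstcuflags suggests. The names first_state, scan_items and REQ_SET, the
numeric flag values, and the item encoding ('R' for a backreference, '?' for a
zero-minimum quantifier, anything else for a literal) are all hypothetical.
Leading assertions are left out because they consume no characters.

```c
/* Toy model only: these flag values are hypothetical, not PCRE2's own. */
enum { REQ_UNSET = -2, REQ_NONE = -1, REQ_SET = 0 };

typedef struct {
  int firstcu;           /* candidate first code unit of a match */
  int firstcuflags;      /* REQ_UNSET, REQ_NONE, or REQ_SET */
  int zerofirstcu;       /* state to restore if the previous item */
  int zerofirstcuflags;  /* turns out to repeat zero times */
} first_state;

/* Scan a string of items and return the final first-code-unit state.
   with_fix == 0 models the old backreference handling, 1 the new one. */
first_state scan_items(const char *items, int with_fix)
{
  first_state s = { 0, REQ_UNSET, 0, REQ_UNSET };
  for (const char *p = items; *p != '\0'; p++) {
    if (*p == '?') {
      /* Previous item may match nothing: restore the saved state. */
      s.firstcu = s.zerofirstcu;
      s.firstcuflags = s.zerofirstcuflags;
    } else if (s.firstcuflags != REQ_UNSET) {
      /* Already decided: this item cannot change it, so the saved
         state is simply the current state. */
      s.zerofirstcu = s.firstcu;
      s.zerofirstcuflags = s.firstcuflags;
    } else if (*p == 'R') {
      /* Backreference: the first code unit is not known at compile time. */
      s.firstcuflags = REQ_NONE;
      if (with_fix) s.zerofirstcuflags = REQ_NONE;  /* the change modelled */
    } else {
      /* Literal as the first item: record it, and note that without it
         there would be no known first code unit. */
      s.zerofirstcuflags = REQ_NONE;
      s.firstcu = (unsigned char)*p;
      s.firstcuflags = REQ_SET;
    }
  }
  return s;
}
```

In this model, scan_items("R?b", 0) ends with 'b' recorded as the first code
unit, which mirrors the /(?=(a))\1?b/ case described in the ChangeLog entry,
while scan_items("R?b", 1) ends with REQ_NONE, so no first code unit is
assumed.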


Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/src/pcre2_compile.c
    code/trunk/testdata/testinput2
    code/trunk/testdata/testoutput2


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2017-12-08 10:25:49 UTC (rev 887)
+++ code/trunk/ChangeLog    2017-12-12 15:01:51 UTC (rev 888)
@@ -65,7 +65,12 @@
 lines were processed (assuming 32-bit ints). They have all been changed to 
 unsigned long ints.


+17. If a backreference with a minimum repeat count of zero was first in a
+pattern, apart from assertions, an incorrect first matching character could be
+recorded. For example, for the pattern /(?=(a))\1?b/, "b" was incorrectly set
+as the first character of a match.

+
 Version 10.30 14-August-2017
 ----------------------------


Modified: code/trunk/src/pcre2_compile.c
===================================================================
--- code/trunk/src/pcre2_compile.c    2017-12-08 10:25:49 UTC (rev 887)
+++ code/trunk/src/pcre2_compile.c    2017-12-12 15:01:51 UTC (rev 888)
@@ -7135,7 +7135,7 @@
     later. */


     HANDLE_SINGLE_REFERENCE:
-    if (firstcuflags == REQ_UNSET) firstcuflags = REQ_NONE;
+    if (firstcuflags == REQ_UNSET) zerofirstcuflags = firstcuflags = REQ_NONE;
     *code++ = ((options & PCRE2_CASELESS) != 0)? OP_REFI : OP_REF;
     PUT2INC(code, 0, meta_arg);



Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2017-12-08 10:25:49 UTC (rev 887)
+++ code/trunk/testdata/testinput2    2017-12-12 15:01:51 UTC (rev 888)
@@ -5375,4 +5375,14 @@


 /[\d-[:print:]]/

+# Perl gets the second of these wrong, giving no match.
+
+"(?<=(a))\1?b"I
+    ab
+    aaab 
+
+"(?=(a))\1?b"I
+    ab
+    aaab 
+
 # End of testinput2


Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2017-12-08 10:25:49 UTC (rev 887)
+++ code/trunk/testdata/testoutput2    2017-12-12 15:01:51 UTC (rev 888)
@@ -16340,6 +16340,34 @@
 /[\d-[:print:]]/
 Failed: error 150 at offset 3: invalid range in character class


+# Perl gets the second of these wrong, giving no match.
+
+"(?<=(a))\1?b"I
+Capturing subpattern count = 1
+Max back reference = 1
+Max lookbehind = 1
+Last code unit = 'b'
+Subject length lower bound = 1
+    ab
+ 0: b
+ 1: a
+    aaab 
+ 0: ab
+ 1: a
+
+"(?=(a))\1?b"I
+Capturing subpattern count = 1
+Max back reference = 1
+Starting code units: a 
+Last code unit = 'b'
+Subject length lower bound = 1
+    ab
+ 0: ab
+ 1: a
+    aaab 
+ 0: ab
+ 1: a
+
 # End of testinput2
 Error -65: PCRE2_ERROR_BADDATA (unknown error number)
 Error -62: bad serialized data