[Pcre-svn] [996] code/trunk: Treat empty-string-matching rep…

Author: Subversion repository
To: pcre-svn
Subject: [Pcre-svn] [996] code/trunk: Treat empty-string-matching repeated conditionals the same as ordinary ones
Revision: 996
          http://www.exim.org/viewvc/pcre2?view=rev&revision=996
Author:   ph10
Date:     2018-09-03 16:20:40 +0100 (Mon, 03 Sep 2018)
Log Message:
-----------
Treat empty-string-matching repeated conditionals the same as ordinary ones 
when checking for an anchored pattern.
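
As a rough illustration of the rule this change describes, here is a toy
model in C. It is only a sketch under stated assumptions: the struct, its
fields, and toy_anchored() are hypothetical names invented for this note, and
none of it is PCRE2's compiled form or the real is_anchored() code. The
new_behaviour flag stands in for the added "|| op == OP_SCOND" test.

```c
#include <stdbool.h>

/* Toy description of a conditional group at the start of a pattern.
   Hypothetical: this is NOT how PCRE2 represents compiled patterns. */
typedef struct {
    bool may_match_empty;        /* loosely: OP_SCOND rather than OP_COND */
    bool has_second_branch;      /* is there an explicit "no" branch? */
    bool all_branches_anchored;  /* does every branch start with ^ ? */
    unsigned min_repeat;         /* quantifier minimum: 0 for *, 1 for + */
} toy_cond;

/* Returns true if the toy model would treat the pattern as anchored.
   new_behaviour == false models the old rule, where a repeated conditional
   that could match an empty string was always assumed to be unanchored. */
static bool toy_anchored(const toy_cond *c, bool new_behaviour)
{
    /* With a minimum of zero the group can be skipped entirely, so it
       cannot anchor the pattern. */
    if (c->min_repeat == 0) return false;

    /* Old rule: give up as soon as the group may match an empty string. */
    if (c->may_match_empty && !new_behaviour) return false;

    /* Without a second branch, the implied empty branch is unanchored. */
    if (!c->has_second_branch) return false;

    return c->all_branches_anchored;
}
```

In this model, a group shaped like (?(1)^()|^)+ (two anchored branches,
minimum of one) is reported as anchored only with new_behaviour set, while
shapes like (?(1)^())+ (no second branch) and (?(1)^()|^)* (minimum of zero)
are unanchored either way, which matches the "Overall options: anchored"
lines in the new test output.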


Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/src/pcre2_compile.c
    code/trunk/testdata/testinput2
    code/trunk/testdata/testoutput2


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2018-09-02 16:53:29 UTC (rev 995)
+++ code/trunk/ChangeLog    2018-09-03 15:20:40 UTC (rev 996)
@@ -179,7 +179,13 @@
 assumed empty second branch cannot be anchored. Demonstrated by test patterns 
 such as /(?(1)^())b/ or /(?(?=^))b/.


+40. A repeated conditional subpattern that could match an empty string was
+always assumed to be unanchored. Now it is checked just like any other
+repeated conditional subpattern, and can be found to be anchored if the minimum
+quantifier is one or more. I can't see much use for a repeated anchored
+pattern, but the behaviour is now consistent.

+
Version 10.31 12-February-2018
------------------------------


Modified: code/trunk/src/pcre2_compile.c
===================================================================
--- code/trunk/src/pcre2_compile.c    2018-09-02 16:53:29 UTC (rev 995)
+++ code/trunk/src/pcre2_compile.c    2018-09-03 15:20:40 UTC (rev 996)
@@ -7866,7 +7866,7 @@


    /* Condition. If there is no second branch, it can't be anchored. */


-   else if (op == OP_COND)
+   else if (op == OP_COND || op == OP_SCOND)
      {
      if (scode[GET(scode,1)] != OP_ALT) return FALSE;
      if (!is_anchored(scode, bracket_map, cb, atomcount, inassert))


Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2018-09-02 16:53:29 UTC (rev 995)
+++ code/trunk/testdata/testinput2    2018-09-03 15:20:40 UTC (rev 996)
@@ -5474,4 +5474,35 @@


/(?(1)^())b/I

+/(?(1)^())+b/I,aftertext
+    abc
+
+/(?(1)^()|^)+b/I,aftertext
+    bbc 
+\= Expect no match     
+    abc
+
+/(?(1)^()|^)*b/I,aftertext
+    bbc 
+    abc
+    xbc 
+
+/(?(1)^())+b/I,aftertext
+    abc
+
+/(?(1)^a()|^a)+b/I,aftertext
+    abc 
+\= Expect no match     
+    bbc
+
+/(?(1)^|^(a))+b/I,aftertext
+    abc 
+\= Expect no match     
+    bbc
+
+/(?(1)^a()|^a)*b/I,aftertext
+    abc 
+    bbc
+    xbc 
+
 # End of testinput2


Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2018-09-02 16:53:29 UTC (rev 995)
+++ code/trunk/testdata/testoutput2    2018-09-03 15:20:40 UTC (rev 996)
@@ -16671,6 +16671,98 @@
 Last code unit = 'b'
 Subject length lower bound = 1


+/(?(1)^())+b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+Last code unit = 'b'
+Subject length lower bound = 1
+    abc
+ 0: b
+ 0+ c
+
+/(?(1)^()|^)+b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+Compile options: <none>
+Overall options: anchored
+First code unit = 'b'
+Subject length lower bound = 1
+    bbc 
+ 0: b
+ 0+ bc
+\= Expect no match     
+    abc
+No match
+
+/(?(1)^()|^)*b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+First code unit = 'b'
+Subject length lower bound = 1
+    bbc 
+ 0: b
+ 0+ bc
+    abc
+ 0: b
+ 0+ c
+    xbc 
+ 0: b
+ 0+ c
+
+/(?(1)^())+b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+Last code unit = 'b'
+Subject length lower bound = 1
+    abc
+ 0: b
+ 0+ c
+
+/(?(1)^a()|^a)+b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+Compile options: <none>
+Overall options: anchored
+First code unit = 'a'
+Last code unit = 'b'
+Subject length lower bound = 2
+    abc 
+ 0: ab
+ 0+ c
+\= Expect no match     
+    bbc
+No match
+
+/(?(1)^|^(a))+b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+Compile options: <none>
+Overall options: anchored
+Last code unit = 'b'
+Subject length lower bound = 1
+    abc 
+ 0: ab
+ 0+ c
+ 1: a
+\= Expect no match     
+    bbc
+No match
+
+/(?(1)^a()|^a)*b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+Last code unit = 'b'
+Subject length lower bound = 1
+    abc 
+ 0: ab
+ 0+ c
+    bbc
+ 0: b
+ 0+ bc
+    xbc 
+ 0: b
+ 0+ c
+
 # End of testinput2
 Error -70: PCRE2_ERROR_BADDATA (unknown error number)
 Error -62: bad serialized data