Revision: 996
http://www.exim.org/viewvc/pcre2?view=rev&revision=996
Author: ph10
Date: 2018-09-03 16:20:40 +0100 (Mon, 03 Sep 2018)
Log Message:
-----------
Treat empty-string-matching repeated conditionals the same as ordinary ones
when checking for an anchored pattern.
Modified Paths:
--------------
code/trunk/ChangeLog
code/trunk/src/pcre2_compile.c
code/trunk/testdata/testinput2
code/trunk/testdata/testoutput2
Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog 2018-09-02 16:53:29 UTC (rev 995)
+++ code/trunk/ChangeLog 2018-09-03 15:20:40 UTC (rev 996)
@@ -179,7 +179,13 @@
assumed empty second branch cannot be anchored. Demonstrated by test patterns
such as /(?(1)^())b/ or /(?(?=^))b/.
+40. A repeated conditional subpattern that could match an empty string was
+always assumed to be unanchored. Now it is checked just like any other
+repeated conditional subpattern, and can be found to be anchored if the minimum
+quantifier is one or more. I can't see much use for a repeated anchored
+pattern, but the behaviour is now consistent.
+
Version 10.31 12-February-2018
------------------------------
Modified: code/trunk/src/pcre2_compile.c
===================================================================
--- code/trunk/src/pcre2_compile.c 2018-09-02 16:53:29 UTC (rev 995)
+++ code/trunk/src/pcre2_compile.c 2018-09-03 15:20:40 UTC (rev 996)
@@ -7866,7 +7866,7 @@
/* Condition. If there is no second branch, it can't be anchored. */
- else if (op == OP_COND)
+ else if (op == OP_COND || op == OP_SCOND)
{
if (scode[GET(scode,1)] != OP_ALT) return FALSE;
if (!is_anchored(scode, bracket_map, cb, atomcount, inassert))
Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2 2018-09-02 16:53:29 UTC (rev 995)
+++ code/trunk/testdata/testinput2 2018-09-03 15:20:40 UTC (rev 996)
@@ -5474,4 +5474,35 @@
/(?(1)^())b/I
+/(?(1)^())+b/I,aftertext
+ abc
+
+/(?(1)^()|^)+b/I,aftertext
+ bbc
+\= Expect no match
+ abc
+
+/(?(1)^()|^)*b/I,aftertext
+ bbc
+ abc
+ xbc
+
+/(?(1)^())+b/I,aftertext
+ abc
+
+/(?(1)^a()|^a)+b/I,aftertext
+ abc
+\= Expect no match
+ bbc
+
+/(?(1)^|^(a))+b/I,aftertext
+ abc
+\= Expect no match
+ bbc
+
+/(?(1)^a()|^a)*b/I,aftertext
+ abc
+ bbc
+ xbc
+
# End of testinput2
Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2 2018-09-02 16:53:29 UTC (rev 995)
+++ code/trunk/testdata/testoutput2 2018-09-03 15:20:40 UTC (rev 996)
@@ -16671,6 +16671,98 @@
Last code unit = 'b'
Subject length lower bound = 1
+/(?(1)^())+b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+Last code unit = 'b'
+Subject length lower bound = 1
+ abc
+ 0: b
+ 0+ c
+
+/(?(1)^()|^)+b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+Compile options: <none>
+Overall options: anchored
+First code unit = 'b'
+Subject length lower bound = 1
+ bbc
+ 0: b
+ 0+ bc
+\= Expect no match
+ abc
+No match
+
+/(?(1)^()|^)*b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+First code unit = 'b'
+Subject length lower bound = 1
+ bbc
+ 0: b
+ 0+ bc
+ abc
+ 0: b
+ 0+ c
+ xbc
+ 0: b
+ 0+ c
+
+/(?(1)^())+b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+Last code unit = 'b'
+Subject length lower bound = 1
+ abc
+ 0: b
+ 0+ c
+
+/(?(1)^a()|^a)+b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+Compile options: <none>
+Overall options: anchored
+First code unit = 'a'
+Last code unit = 'b'
+Subject length lower bound = 2
+ abc
+ 0: ab
+ 0+ c
+\= Expect no match
+ bbc
+No match
+
+/(?(1)^|^(a))+b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+Compile options: <none>
+Overall options: anchored
+Last code unit = 'b'
+Subject length lower bound = 1
+ abc
+ 0: ab
+ 0+ c
+ 1: a
+\= Expect no match
+ bbc
+No match
+
+/(?(1)^a()|^a)*b/I,aftertext
+Capturing subpattern count = 1
+Max back reference = 1
+Last code unit = 'b'
+Subject length lower bound = 1
+ abc
+ 0: ab
+ 0+ c
+ bbc
+ 0: b
+ 0+ bc
+ xbc
+ 0: b
+ 0+ c
+
# End of testinput2
Error -70: PCRE2_ERROR_BADDATA (unknown error number)
Error -62: bad serialized data
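
Note on the pcre2_compile.c change: the one-line edit lets the anchoring check treat OP_SCOND the same way as OP_COND. The sketch below is a toy model of that decision only, not the real is_anchored() from pcre2_compile.c. Every name in it (toy_kind, toy_cond, toy_cond_is_anchored, handle_scond) is hypothetical, and each branch is reduced to a single flag saying whether it starts with ^. The min_repeat test is an assumption taken from the ChangeLog wording ("anchored if the minimum quantifier is one or more"); the real code works on compiled opcodes instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for OP_COND and OP_SCOND; not PCRE2's real opcodes. */
typedef enum { TOY_COND, TOY_SCOND } toy_kind;

/* Toy description of one conditional group such as (?(1)^()|^)+ */
typedef struct {
    toy_kind kind;           /* TOY_SCOND models the empty-matching repeated case */
    bool has_second_branch;  /* false for (?(1)^()), true for (?(1)^()|^) */
    bool first_anchored;     /* first branch starts with ^ */
    bool second_anchored;    /* second branch starts with ^ */
    unsigned min_repeat;     /* 0 for *, 1 for + or no quantifier */
} toy_cond;

/* handle_scond == false models the behaviour before rev 996,
   handle_scond == true models the behaviour after it. */
static bool toy_cond_is_anchored(const toy_cond *c, bool handle_scond)
{
    assert(c != NULL);

    /* Assumption from the ChangeLog: a group that may be skipped entirely
       cannot anchor the pattern. */
    if (c->min_repeat == 0) return false;

    /* Before the change, the empty-matching repeated form was never examined. */
    if (c->kind == TOY_SCOND && !handle_scond) return false;

    /* With no second branch, the implied empty branch is unanchored. */
    if (!c->has_second_branch) return false;

    /* Both branches must be anchored for the group to be anchored. */
    return c->first_anchored && c->second_anchored;
}
```

In this model, /(?(1)^()|^)+b/ corresponds to { TOY_SCOND, true, true, true, 1 } and is reported as anchored only when handle_scond is true, which matches the "Overall options: anchored" line in the new test output. /(?(1)^())+b/ (no second branch) and /(?(1)^()|^)*b/ (minimum of zero) stay unanchored either way.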