On 2019-07-01 10:28, ph10 wrote:
> On Sun, 30 Jun 2019, ND via Pcre-dev wrote:
>> PCRE2 version 10.33 2019-04-16
> > /\A(?:.|..)(*THEN)c/
> > abc
> > No match
> >>> Perl is match "abc".
> > I suppose "next innermost alternative" is interpreted differently by PCRE and
> > Perl.
> >> If so, may be PCRE should go Perl way in this matter?
>I think this is a bug in Perl and I will report it as such.
After reading this post
https://rt.perl.org/Public/Bug/Display.html?id=92898#txn-1227153
I am not sure that there is a Perl bug.
I suppose that there are two branches starting from "(?:.|..)". Each of
these branches ends with a common TAIL that runs to the end of the pattern.
Here are the two branches:
1) .(*THEN)c
2) ..(*THEN)c
Let's look at the Perl debug output:
Matching REx "\A(?:.|..)(*THEN)c" against "abcd"
Intuit: trying to determine minimum start position...
doing 'check' fbm scan, [1..3] gave 2
Found floating substr "c" at offset 2 (rx_origin now 0)...
(multiline anchor test skipped)
Intuit: Successfully guessed: match at offset 0
0 <> <abcd> | 0| 1:SBOL /\A/(2)
0 <> <abcd> | 0| 2:BRANCH(4)
0 <> <abcd> | 1| 3:REG_ANY(8)
1 <a> <bcd> | 1| 8:CUTGROUP(10)
1 <a> <bcd> | 2| 10:EXACT <c>(12)
| 2| failed...
| 1| failed...
0 <> <abcd> | 0| 4:BRANCH(7)
0 <> <abcd> | 1| 5:REG_ANY(6)
1 <a> <bcd> | 1| 6:REG_ANY(8)
2 <ab> <cd> | 1| 8:CUTGROUP(10)
2 <ab> <cd> | 2| 10:EXACT <c>(12)
3 <abc> <d> | 2| 12:END(0)
Match successful!
So backtracking onto (*THEN) in BRANCH(4) caused this branch to fail
immediately, and matching jumped to BRANCH(7).
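
To make the two readings concrete, here is a rough Python sketch. It is
not Perl or PCRE2 code, and the names perl_like and prune_like are made up
for illustration. perl_like models what the trace shows: each alternative
carries the common tail, and a failure after (*THEN) moves on to the next
alternative. prune_like models the other reading, where a (*THEN) outside
the group has no alternative to jump to, so backtracking onto it fails the
attempt at this start position. I am only assuming that this is roughly
why PCRE2 reports "No match".

```python
import re

ALTERNATIVES = [".", ".."]  # the two arms of (?:.|..)
TAIL = "c"                  # the literal that follows (*THEN)


def perl_like(subject):
    """Model the pattern as two whole branches: .(*THEN)c and ..(*THEN)c."""
    for alt in ALTERNATIVES:
        m = re.match(alt, subject)  # anchored at offset 0, like \A
        if m is None:
            continue
        if subject.startswith(TAIL, m.end()):
            return subject[: m.end() + len(TAIL)]
        # Tail failed: backtracking onto (*THEN) abandons this branch only,
        # so the loop goes on to the next alternative.
    return None


def prune_like(subject):
    """Model (*THEN) as having no enclosing alternative to fall back to."""
    for alt in ALTERNATIVES:
        m = re.match(alt, subject)
        if m is None:
            continue
        if subject.startswith(TAIL, m.end()):
            return subject[: m.end() + len(TAIL)]
        # Tail failed: backtracking onto the verb fails the whole attempt,
        # so the second alternative is never tried.
        return None
    return None


if __name__ == "__main__":
    for s in ("abc", "abcd"):
        print(s, "perl_like:", perl_like(s), "prune_like:", prune_like(s))
```

With the subject "abcd" from the trace, perl_like returns "abc" (via the
second branch) and prune_like returns None, which mirrors the difference
between the two engines reported at the top of this thread.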