Re: [pcre-dev] Bactracking controls in subroutines

Author: ph10
Date:  
To: ND
CC: Pcre-dev
Subject: Re: [pcre-dev] Bactracking controls in subroutines
On Tue, 3 Jul 2018, ND via Pcre-dev wrote:

> PCRE2 version 10.31 2018-02-12
> /(?1)(*F)|(a(*COMMIT))/
> a
> 0: a
> 1: a
>
>
> In Perl this pattern not matched.


> Is there weighty reason to stay backtracking controls not "cross-subroutine"
> when backtracking is cross-subroutine? And stay incompatible with Perl?


This seems to be another situation where Perl is inconsistent. Look at
this example:

Perl 5.026002 Regular Expressions

/(a(*COMMIT)b){0}a(?1)|aac/
    aac
 0: aac


Both Perl and PCRE2 match in this case. But if I make the PCRE2
interpreter not constrain the (*COMMIT) to the subroutine call, this
match fails (and your example above also fails). Some other cases in
testinput1 then also behave differently from Perl. For example,
/(?1)(A(*COMMIT)|B)D/ no longer matches "ABD" in the subject "ABXABD".

So it looks as though Perl *sometimes* confines (*COMMIT) to a
subroutine call. Hmm. These Perl examples are confusing:

Perl 5.026002 Regular Expressions

/(?1)(A(*COMMIT)|B)D/
    ABCABD
 0: ABD
 1: B


/(A(*COMMIT)|B)(A(*COMMIT)|B)D/
    ABCABD
 0: ABD
 1: A
 2: B


PCRE2 matches the first one, but gives "no match" for the second. There
are no subroutine calls in the second, so it looks like a Perl bug. I
will report it.
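
For readers who want to trace the cases by hand, here is a toy
backtracking matcher in Python. It is only a sketch under stated
assumptions, not PCRE2's or Perl's actual implementation: patterns are
hand-built tuples rather than parsed regexes, captures are not modelled,
and the names search, run, Committed and CASES are invented for this
illustration. The confine flag switches between confining (*COMMIT) to
the subroutine call (the PCRE2 behaviour described in this message) and
letting it abort the whole match.

```python
class Committed(Exception):
    """Raised when the toy matcher backtracks onto a (*COMMIT)."""

    def __init__(self, depth):
        super().__init__(depth)
        self.depth = depth  # subroutine nesting level where the verb sits


def search(pattern, groups, subject, confine):
    """Return the matched text or None. 'groups' maps group number to its body."""

    def run(node, pos, depth, k):
        # Try 'node' at 'pos'; on success call the continuation k(end).
        kind = node[0]
        if kind == "lit":
            if subject.startswith(node[1], pos):
                return k(pos + len(node[1]))
            return None
        if kind == "fail":  # (*F)
            return None
        if kind == "commit":  # (*COMMIT)
            end = k(pos)
            if end is None:  # everything after the verb failed
                raise Committed(depth)
            return end
        if kind == "seq":
            if len(node) == 1:
                return k(pos)
            rest = ("seq",) + node[2:]
            return run(node[1], pos, depth, lambda p: run(rest, p, depth, k))
        if kind == "alt":
            for option in node[1:]:
                end = run(option, pos, depth, k)
                if end is not None:
                    return end
            return None
        if kind == "group":  # group matched in line
            return run(groups[node[1]], pos, depth, k)
        if kind == "call":  # (?n) subroutine call
            try:
                return run(groups[node[1]], pos, depth + 1, k)
            except Committed as exc:
                if confine and exc.depth > depth:
                    return None  # only the call fails; the caller backtracks
                raise
        raise ValueError(f"unknown node {kind!r}")

    for start in range(len(subject) + 1):
        try:
            end = run(pattern, start, 0, lambda p: p)
        except Committed:
            return None  # no bump-along after a triggered (*COMMIT)
        if end is not None:
            return subject[start:end]
    return None


A_OR_B = ("alt", ("seq", ("lit", "A"), ("commit",)), ("lit", "B"))
CASES = [
    # (label, top-level pattern, group table, subject)
    ("/(?1)(*F)|(a(*COMMIT))/",
     ("alt", ("seq", ("call", 1), ("fail",)), ("group", 1)),
     {1: ("seq", ("lit", "a"), ("commit",))}, "a"),
    ("/(a(*COMMIT)b){0}a(?1)|aac/",
     ("alt", ("seq", ("lit", "a"), ("call", 1)), ("lit", "aac")),
     {1: ("seq", ("lit", "a"), ("commit",), ("lit", "b"))}, "aac"),
    ("/(?1)(A(*COMMIT)|B)D/",
     ("seq", ("call", 1), ("group", 1), ("lit", "D")),
     {1: A_OR_B}, "ABCABD"),
    ("/(A(*COMMIT)|B)(A(*COMMIT)|B)D/",
     ("seq", ("group", 1), ("group", 2), ("lit", "D")),
     {1: A_OR_B, 2: A_OR_B}, "ABCABD"),
]

for label, pattern, groups, subject in CASES:
    confined = search(pattern, groups, subject, confine=True)
    unconfined = search(pattern, groups, subject, confine=False)
    print(f"{label}  {subject}: confined={confined!r} unconfined={unconfined!r}")
```

In this model the confined run finds "a", "aac" and "ABD" for the first
three patterns and nothing for the fourth, while the unconfined run
finds nothing for any of them. The fourth pattern has no subroutine
call, so the flag makes no difference there.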

I am not going to make any changes to PCRE2 just at the moment. What it
does is documented and is (mostly) compatible with Perl. If/when Perl
changes I might look at this again.

Unless, of course, I am mistaken in how I've interpreted these examples.

Philip

--
Philip Hazel