Re: [pcre-dev] Capture not reset inside recursion

Author: Zoltán Herczeg
Date:  
To: Pcre-dev@exim.org, ND
Subject: Re: [pcre-dev] Capture not reset inside recursion
The title is misleading; that feature is a JavaScript thing:

/(?:(a)b|\1)+/ matches aba in Perl, but not in JavaScript.

Anyway, it looks like the problem here is that ()? clears the capturing bracket in Perl when the empty case is selected, while PCRE2 restores its previous value.
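
A toy model of the two behaviours, applied to the pattern from the original report, /(?:(a)?\1)+/ against aaa. This is only a sketch: match_group_loop and the policy names "clear" and "restore" are made up for illustration and say nothing about how either engine is implemented. It assumes the match is anchored at offset 0 and that a backreference to an unset group fails.

```python
def match_group_loop(subject, policy):
    """Toy model of /(?:(a)?\\1)+/ anchored at offset 0.

    policy="clear":   taking the empty branch of (a)? unsets group 1
    policy="restore": taking the empty branch leaves group 1 at the value
                      it had when the iteration started
    Returns the matched text, or None if not even one iteration matches.
    """
    pos = 0
    group1 = None
    iterations = 0
    while True:
        saved = group1
        end = None
        # Greedy: first let (a) consume a character, then \1 must repeat it.
        if subject.startswith("a", pos) and subject.startswith("a", pos + 1):
            group1, end = "a", pos + 2
        else:
            # Backtrack into the empty branch of (a)?.
            group1 = None if policy == "clear" else saved
            # An unset backreference is assumed to fail here.
            if group1 is not None and subject.startswith(group1, pos):
                end = pos + len(group1)
        if end is None:
            break  # this iteration failed, the + loop stops
        pos = end
        iterations += 1
    return subject[:pos] if iterations else None


print(match_group_loop("aaa", "clear"))    # aa
print(match_group_loop("aaa", "restore"))  # aaa
```

In this model the "clear" policy stops after aa, like the Perl result in the report, and the "restore" policy consumes aaa, like the PCRE2 result.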

Matching /(?:(a)??b)+/ against abb also shows this difference: the capturing bracket is empty in Perl, while it is set to a in PCRE2.

Even more interesting is that /(?:(?:(a))??\1)+/ also matches only aa, even though the body of the ?? should not be matched in the second iteration.

Let's do some debugging:
Match /(?:(?{ print "<$1>" })(?:(a))??(?{ print "[$1]" })\1)+/ against aaa

Output:
<>[][a]<a>[][a]

In the second iteration, the capturing bracket contains a before the ?? is executed, and is reset to nothing after.
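
The trace can be reproduced with a small hand-written model of the lazy pattern. This is a sketch under stated assumptions, not a description of Perl internals: trace_lazy and the policy names are hypothetical, the match is anchored at offset 0, an unset backreference is assumed to fail, and "clear" means that skipping the ?? group unsets the capture.

```python
def trace_lazy(subject, policy):
    """Toy model of /(?:<print>(?:(a))??<print>\\1)+/ anchored at offset 0.

    Emits "<$1>" at the start of each iteration and "[$1]" after the ??
    group, as the two code blocks in the debugging pattern do.
    Returns (trace, matched_text).
    """
    out = []
    pos = 0
    group1 = None
    while True:
        saved = group1
        out.append("<%s>" % (group1 or ""))
        end = None
        # Lazy: try skipping the group first.
        group1 = None if policy == "clear" else saved
        out.append("[%s]" % (group1 or ""))
        if group1 is not None and subject.startswith(group1, pos):
            end = pos + len(group1)
        # Backtrack: enter the group and capture "a".
        elif subject.startswith("a", pos):
            group1 = "a"
            out.append("[a]")
            if subject.startswith("a", pos + 1):
                end = pos + 2
        if end is None:
            break  # this iteration failed, the + loop stops
        pos = end
    return "".join(out), subject[:pos]


trace, matched = trace_lazy("aaa", "clear")
print(trace)    # <>[][a]<a>[][a]
print(matched)  # aa
```

With the "clear" policy the model prints the same trace as above and stops at aa. With policy="restore" the second iteration keeps a in the bracket and the model consumes aaa.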

You will not believe this, but /(?:(?:(?{ print "!" })(a))?\1)+/ matches aaa, just like PCRE2. The code block should have zero effect on the matching, yet it disables something (probably an optimization) and the pattern works as expected.

Is this a Perl bug?

Regards,
Zoltan
 
-------- Original message --------
From: ND via Pcre-dev <pcre-dev@exim.org>
Date: 6 June 2021 00:44:08
Subject: [pcre-dev] Capture not reset inside recursion
To: Pcre-dev@exim.org

Here is the pcretest listing:
PCRE2 version 10.35 2020-05-09
/(?:(a)?\1)+/
aaa
0: aaa
Expected result:
0: aa
Perl result:
0: aa
--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev