Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 2309] Definition order of self-referencing capturing duplicate named groups respective to their initial initialization changes behavior
https://bugs.exim.org/show_bug.cgi?id=2309

Philip Hazel <ph10@???> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|NEW                         |RESOLVED


--- Comment #1 from Philip Hazel <ph10@???> ---
Thanks for your input, but there are two problems with this. The first is that
the current behaviour is the same as Perl's, and changing it would break Perl
compatibility. The second is that PCRE has no record of the order in which
groups were set, so "the most recently set capturing group" has no meaning.

Both PCRE and Perl treat group names as simple aliases for group numbers. No
doubt this is because, in both cases, they were bolted onto existing
number-based code. I never did like the idea of duplicate names; PCRE had named
groups before Perl, and the J option was added only when Perl allowed them. The
branch reset example works because two different groups set a value for group
1, and there is only one "slot" for remembering what group 1 captures, so \1
will indeed reference the most recently set value.
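To make the "one slot per group number" point concrete, here is a toy model in Python. It is only a sketch, not PCRE's or Perl's actual code: the class `CaptureTable` and its methods are hypothetical names invented for illustration. The duplicate-name lookup rule it uses (scan the aliased groups in pattern order and take the first one that is set) is an assumption based on how the PCRE2 documentation describes references to non-unique names.

```python
class CaptureTable:
    """Toy model: names are aliases for group numbers, one slot per number."""

    def __init__(self, group_count, names):
        # names maps a group name to the group numbers it aliases,
        # listed in the order the groups appear in the pattern.
        self.names = names
        self.slots = [None] * (group_count + 1)  # slot 0 would be the whole match

    def set_group(self, number, text):
        # Overwrites the slot; nothing records *when* it was set.
        self.slots[number] = text

    def by_number(self, number):
        return self.slots[number]

    def by_name(self, name):
        # Assumed rule: first aliased group (in pattern order) that is set.
        for number in self.names[name]:
            if self.slots[number] is not None:
                return self.slots[number]
        return None


# Branch reset: both alternatives write to group 1, so the later write wins.
reset = CaptureTable(1, {})
reset.set_group(1, "aa")
reset.set_group(1, "bb")
branch_reset_value = reset.by_number(1)

# Duplicate names: "n" aliases groups 1 and 2. Group 2 is set more recently,
# but the table has no set-order record, so the lookup still finds group 1.
dup = CaptureTable(2, {"n": [1, 2]})
dup.set_group(1, "older")
dup.set_group(2, "newer")
duplicate_name_value = dup.by_name("n")
```

In this model `branch_reset_value` is `"bb"` because there is a single slot for group 1, while `duplicate_name_value` is `"older"` because the lookup depends only on pattern order, not on which group was set last.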

I have resisted the idea of allowing different names for the same group number,
as Perl does, because I think that really is too confusing. For example, Perl
allows the pattern /(?|(?<AA>aa)|(?<BB>bb))/ but after a successful match,
referencing the capture by either name gives the same value. I wasn't expecting
this when I recently did some tests, but the behaviour is documented. Both
names, AA and BB, are just aliases for group 1.
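The aliasing described for /(?|(?<AA>aa)|(?<BB>bb))/ can be pictured with a few lines of Python. This is a hand-written illustration rather than anything taken from Perl: `name_to_number`, `slots`, `match_branch` and `capture` are hypothetical names, and no regex engine is actually run.

```python
# Both names point at the same group number under branch reset.
name_to_number = {"AA": 1, "BB": 1}
slots = {1: None}


def match_branch(text):
    """Pretend one alternative matched and captured `text` into group 1."""
    slots[1] = text


def capture(name):
    """Look a capture up by name: resolve to a number, then read the slot."""
    return slots[name_to_number[name]]


# Suppose the second alternative, (?<BB>bb), is the one that matched.
match_branch("bb")
values = (capture("AA"), capture("BB"))
```

Here `values` is `("bb", "bb")`: asking for AA returns what the BB branch captured, because neither name has storage of its own.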

I'm sure it would all be different if one were starting again from scratch, but
we've got where we are by evolution, which is well-known to be messy! I
understand your concern, but I do not think it would be right to break Perl
compatibility.
