[pcre-dev] [Bug 760] Named patterns with same index conflict

Author: Stan Vassilev
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 760] Named patterns with same index conflict
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=760




--- Comment #4 from Stan Vassilev <sv4php@???> 2008-09-14 21:26:53 ---
I think forbidding it may have some disruptive effects: apart from the
name1/name2 collision, there is also the no_name/name collision, for
example:

php> preg_match('/(?|foo()|bar(?<barPattern>))/', 'foo', $m); var_dump($m);

array(3) {
  [0]=>
  string(3) "foo"
  ["barPattern"]=>
  string(0) ""
  [1]=>
  string(0) ""
}

Also let me explain my use case (I realize it's not very typical, but still):

What I use switch blocks (?|..|..) for right now is merging multiple regexes
into a single one while preserving the subpattern numbers in each (necessary,
as I may have backreferences etc., which are absolute).

Ex.:


/hi world([.!?])/i

/^[a-z]+$/m

/(.*?)\</s

would produce:

/(?|(?i:hi world([.!?]))|(?m:^[a-z]+$)|(?s:(.*?)\<))/
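The merge step above could be sketched roughly as follows. The helper name mergePatterns and its input shape (a list of [body, flags] pairs) are assumptions made for illustration, not anything PCRE or PHP provides; it only builds the pattern string and does not validate the bodies.

```php
<?php
// Hypothetical helper: wrap each regex body in an inline-flag group and
// join the results as alternatives of a single (?|...) group, so that
// subpattern numbering restarts in every alternative.
// Each entry is [body, flags]; flags may be an empty string.
function mergePatterns(array $patterns): string
{
    $branches = [];
    foreach ($patterns as $entry) {
        list($body, $flags) = $entry;
        // (?i:...) scopes the flags to this alternative only;
        // with empty flags this is a plain non-capturing group.
        $branches[] = '(?' . $flags . ':' . $body . ')';
    }
    return '/(?|' . implode('|', $branches) . ')/';
}

$merged = mergePatterns([
    ['hi world([.!?])', 'i'],
    ['^[a-z]+$', 'm'],
    ['(.*?)\<', 's'],
]);
```

For the three regexes above, $merged is the same string as the merged pattern shown.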

This works well, but at some point I needed to see exactly which segment of
the group had matched, so I added an empty named pattern at the end of each
segment (so as not to disturb the pattern numbers), as a marker of sorts, and
checked the name:

/(?|pat1(?P<patName1>)|pat2(?P<patName2>)|pat3(?P<patName3>))/

It was then that I noticed that names are reported for patterns regardless of
whether the name was in the matched segment or not.

A workaround I'm experimenting with is to prepend enough '()'-s to the marker
of each segment so that each marker has a unique index in the entire pattern;
then I can check the last named pattern name reported, and that's my marker.
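
A minimal sketch of that padding workaround, under stated assumptions: the helper names buildMarkedPattern and matchedMarker are hypothetical, the caller supplies the number of capture groups in each segment, and the "last named key" check relies on preg_match() leaving trailing unmatched groups out of the result array (its default behaviour as far as I can tell, but worth verifying on the PHP/PCRE version in use).

```php
<?php
// Hypothetical helper: merge segments into one (?|...) pattern and pad each
// segment with empty groups so that its marker gets a unique group number.
// $segments maps a marker name to [sub-pattern, number of capture groups in it].
function buildMarkedPattern(array $segments): string
{
    // Markers are numbered after the largest group count of any segment.
    $maxGroups = 0;
    foreach ($segments as $segment) {
        $maxGroups = max($maxGroups, $segment[1]);
    }

    $branches = [];
    $offset = 0;
    foreach ($segments as $name => $segment) {
        list($pattern, $groups) = $segment;
        // The marker of this segment ends up as group $maxGroups + $offset + 1.
        $padding = str_repeat('()', $maxGroups - $groups + $offset);
        $branches[] = $pattern . $padding . '(?P<' . $name . '>)';
        $offset++;
    }
    return '/(?|' . implode('|', $branches) . ')/';
}

// Return the last string key found in a preg_match() result, or null.
// Assumption: groups after the last participating one are not reported.
function matchedMarker(array $matches): ?string
{
    $last = null;
    foreach ($matches as $key => $value) {
        if (is_string($key)) {
            $last = $key;
        }
    }
    return $last;
}

$pattern = buildMarkedPattern([
    'patName1' => ['pat1', 0],
    'patName2' => ['pat2', 0],
    'patName3' => ['pat3', 0],
]);
// $pattern: /(?|pat1(?P<patName1>)|pat2()(?P<patName2>)|pat3()()(?P<patName3>))/
```

After preg_match($pattern, $subject, $m), matchedMarker($m) would then be expected to name the segment that matched, since the earlier names are only aliases for the padding groups.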


--
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email