On Thu, 12 Jul 2012, Ahmad Amireh wrote:
> I'm having trouble figuring out how to capture duplicate named subpatterns (as
> allowed by the PCRE_DUPNAMES option). My initial understanding from reading
> the manual was that I could refer to a subpattern(s) capture using a name
> instead of the capture order number. While that is true, it seems the named
> subpattern capture always refers to the _last_ branch in which it was
> _defined_ and not necessarily matched. The example in the manual is almost
> exactly like what I'm trying to do so I'll just use it to explain:
>
> (?J)(?<DN>Mon|Fri|Sun)(?:day)?|(?<DN>Tue)(?:sday)?|(?<DN>Wed)(?:nesday)?|(?<DN>Thu)(?:rsday)?|(?<DN>Sat)(?:urday)?

This example works for me when I test it using the pcretest program:

PCRE version 8.31 2012-07-06
/(?J)(?<DN>Mon|Fri|Sun)(?:day)?|(?<DN>Tue)(?:sday)?|(?<DN>Wed)(?:nesday)?|(?<DN>Thu)(?:rsday)?|(?<DN>Sat)(?:urday)?/
Monday\CDN
0: Monday
1: Mon
C Mon (3) DN
Tuesday\CDN
0: Tuesday
1: <unset>
2: Tue
C Tue (3) DN
Wednesday\CDN
0: Wednesday
1: <unset>
2: <unset>
3: Wed
C Wed (3) DN
Thursday\CDN
0: Thursday
1: <unset>
2: <unset>
3: <unset>
4: Thu
C Thu (3) DN
Friday\CDN
0: Friday
1: Fri
C Fri (3) DN
Saturday\CDN
0: Saturday
1: <unset>
2: <unset>
3: <unset>
4: <unset>
5: Sat
C Sat (3) DN

The \CDN option on the data lines means "use pcre_copy_named_substring
to collect the value of substring DN after the match". The same test
works with \GDN (using pcre_get_named_substring).

How are you trying to extract the named substring? The two functions
mentioned above return the first substring with the given name that is
actually set.
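
To make "the first substring with the given name that is actually set"
concrete, here is a minimal sketch in plain C. It is not PCRE's source
code and it does not call the library. The helper names first_set_group
and copy_first_set are hypothetical, and the list of group numbers that
share the name DN is passed in by hand rather than read from the compiled
pattern's name table. It assumes the usual ovector layout, where group n
occupies ovector[2*n] and ovector[2*n+1] and an unset group has offsets
of -1.

```c
#include <string.h>

/* Hypothetical helper, not part of PCRE: given the numbers of all groups
   that share one name, return the first one that took part in the match,
   or -1 if none did. stringcount is the value a successful match call
   returned (highest set group number plus one). */
static int first_set_group(const int *groups, int ngroups,
                           const int *ovector, int stringcount)
{
    int i;
    for (i = 0; i < ngroups; i++) {
        int n = groups[i];
        /* Groups numbered >= stringcount were never reached; an unset
           group inside that range has a start offset of -1. */
        if (n < stringcount && ovector[2 * n] >= 0)
            return n;
    }
    return -1;
}

/* Hypothetical helper: copy the first set group with the shared name into
   buffer as a NUL-terminated string. Returns the substring length, or -1
   if no group is set or the buffer is too small. */
static int copy_first_set(const char *subject, const int *groups, int ngroups,
                          const int *ovector, int stringcount,
                          char *buffer, int buffersize)
{
    int n = first_set_group(groups, ngroups, ovector, stringcount);
    int len;

    if (n < 0)
        return -1;
    len = ovector[2 * n + 1] - ovector[2 * n];
    if (len + 1 > buffersize)
        return -1;
    memcpy(buffer, subject + ovector[2 * n], (size_t)len);
    buffer[len] = '\0';
    return len;
}
```

For the "Tuesday" case in the transcript above, the DN groups are 1 to 5,
the match sets group 0 to offsets 0..7, leaves group 1 unset and sets
group 2 to 0..3, so stringcount is 3. Under those assumptions the sketch
picks group 2 and copies "Tue" with length 3, which agrees with the
"C Tue (3) DN" line.
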
Philip
--
Philip Hazel