Re: [pcre-dev] Capturing duplicate named subpatterns in cond…

Author: Philip Hazel
Date:  
To: Ahmad Amireh
CC: pcre-dev
Subject: Re: [pcre-dev] Capturing duplicate named subpatterns in conditional branches always refers to the last
On Thu, 12 Jul 2012, Ahmad Amireh wrote:

> I'm having trouble figuring out how to capture duplicate named subpatterns (as
> allowed by the PCRE_DUPNAMES option). My initial understanding from reading
> the manual was that I could refer to a subpattern(s) capture using a name
> instead of the capture order number. While that is true, it seems the named
> subpattern capture always refers to the _last_ branch in which it was
> _defined_ and not necessarily matched. The example in the manual is almost
> exactly like what I'm trying to do so I'll just use it to explain:
>
> (?J)(?<DN>Mon|Fri|Sun)(?:day)?|(?<DN>Tue)(?:sday)?|(?<DN>Wed)(?:nesday)?|(?<DN>Thu)(?:rsday)?|(?<DN>Sat)(?:urday)?


This example works for me when I test it using the pcretest program:

PCRE version 8.31 2012-07-06

/(?J)(?<DN>Mon|Fri|Sun)(?:day)?|(?<DN>Tue)(?:sday)?|(?<DN>Wed)(?:nesday)?|(?<DN>Thu)(?:rsday)?|(?<DN>Sat)(?:urday)?/
  Monday\CDN
 0: Monday
 1: Mon
  C Mon (3) DN
  Tuesday\CDN
 0: Tuesday
 1: <unset>
 2: Tue
  C Tue (3) DN
  Wednesday\CDN
 0: Wednesday
 1: <unset>
 2: <unset>
 3: Wed
  C Wed (3) DN
  Thursday\CDN
 0: Thursday
 1: <unset>
 2: <unset>
 3: <unset>
 4: Thu
  C Thu (3) DN
  Friday\CDN
 0: Friday
 1: Fri
  C Fri (3) DN
  Saturday\CDN
 0: Saturday
 1: <unset>
 2: <unset>
 3: <unset>
 4: <unset>
 5: Sat
  C Sat (3) DN


The \CDN escape on each data line tells pcretest to call
pcre_copy_named_substring() to collect the value of substring DN after
the match; the result is shown on the lines beginning with "C". The
same test works with \GDN, which uses pcre_get_named_substring()
instead.

How are you trying to extract the named substring? The two functions
mentioned above return the first substring with the given name that is
actually set.
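
To illustrate the "first substring with the given name that is actually
set" rule, here is a minimal, self-contained sketch in C. It is a model
of the lookup, not PCRE's source code and not a call into the library:
the struct name_entry type, the dn_table array, and the first_set_group()
function are all hypothetical names invented for this illustration. It
assumes the name-to-number entries are visited in ascending group number
and that the offsets vector holds start/end pairs with -1 marking an
unset group, as pcre_exec() fills it in.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one name-table row: a group number and its name.
 * With duplicate names, several rows share the same name. */
struct name_entry {
    int number;
    const char *name;
};

/* The five DN groups of the day-name pattern, in ascending group number.
 * (Illustrative data; a real program would read the compiled pattern's
 * name table instead of hard-coding it.) */
const struct name_entry dn_table[] = {
    {1, "DN"}, {2, "DN"}, {3, "DN"}, {4, "DN"}, {5, "DN"}
};
#define DN_TABLE_LEN (sizeof dn_table / sizeof dn_table[0])

/* Return the number of the first group called `name` that was set by the
 * match, or -1 if no group with that name was set.
 *
 * ovector holds (start, end) offset pairs; a start of -1 means "unset".
 * rc is the match return value: highest-numbered set group plus one,
 * so only pairs 0 .. rc-1 are meaningful. */
int first_set_group(const struct name_entry *table, size_t len,
                    const char *name, const int *ovector, int rc)
{
    for (size_t i = 0; i < len; i++) {
        int n = table[i].number;
        if (strcmp(table[i].name, name) != 0)
            continue;               /* a different name: skip this row */
        if (n < rc && ovector[2 * n] >= 0)
            return n;               /* first group with this name that is set */
    }
    return -1;
}
```

For the "Wednesday" line above, groups 1 and 2 are unset and group 3
holds "Wed", so with an offsets vector of {0, 9, -1, -1, -1, -1, 0, 3}
and a return value of 4 the function yields 3. Code that instead picks
one fixed group number for DN and reads that pair directly can land on
an unset group, which may be what you are seeing.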

Philip

--
Philip Hazel