Re: [pcre-dev] Using pcre: the /g behaviour

Author: Philip Hazel
Date:  
To: jonetsu
CC: pcre-dev
Subject: Re: [pcre-dev] Using pcre: the /g behaviour
On Wed, 22 Feb 2012, jonetsu wrote:

> But there's more than the rate to extract and this is where I haven't
> found how to duplicate the behaviour of continuing a search, when using
> pcredemo.


pcredemo -g

> 1) How can the /g option be added to pcredemo ?


pcredemo -g

> 2) In the case of the finding both rate and ceil values as with
> pcretest above, how come the vector index is again at 0 for the ceil
> value and, what would the 'unset' mean ?


Because there are two separate match operations, and the offsets vector is
filled afresh for each one. Unset means that the subpattern did not
participate in the match.

> /rate (\d+)Kbit|ceil (\d+)Kbit/g
> quantum 12500 rate 300000Kbit ceil 540000Kbit
> 0: rate 300000Kbit
> 1: 300000
> 0: ceil 540000Kbit
> 1: <unset>
> 2: 540000


In the first match, subpattern 1 matched 300000. In the second match
(starting from the end of the first one) subpattern 1 did not match
anything, but subpattern 2 matched 540000.
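
The two separate matches can be reproduced outside pcretest. The following
is a minimal sketch using Python's re module rather than the PCRE C API, on
the assumption that alternation and unset groups behave the same way for
this pattern; the helper name global_matches is made up for illustration.
Unlike pcretest, the sketch also prints the trailing unset group 2 for the
first match.

```python
import re

# Same pattern and subject line as in the pcretest session quoted above.
pattern = re.compile(r"rate (\d+)Kbit|ceil (\d+)Kbit")
subject = "quantum 12500 rate 300000Kbit ceil 540000Kbit"


def global_matches(regex, text):
    """Return one (whole_match, groups) pair per successive match."""
    return [(m.group(0), m.groups()) for m in regex.finditer(text)]


if __name__ == "__main__":
    for whole, groups in global_matches(pattern, subject):
        print("0:", whole)
        for number, value in enumerate(groups, start=1):
            # A group that took no part in this match comes back as None.
            print(f"{number}:", "<unset>" if value is None else value)
```

Each match starts numbering again at 0 because each is a fresh match
operation with its own set of captured groups.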

Note that there is more subtlety than you might expect in implementing
/g (or -g). See the code of pcredemo.c for details.
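
One of those subtleties is a pattern that can match the empty string: a
loop that simply restarts at the end of the previous match would never
advance. The following is a simplified sketch in Python's re module, not a
transcription of pcredemo.c, which is more careful (for example about
retrying at the same position for a non-empty match and about CRLF
newlines). The function name find_all_spans is hypothetical.

```python
import re


def find_all_spans(regex, text):
    """Collect (start, end) offsets of successive matches, /g style."""
    spans = []
    pos = 0
    while pos <= len(text):
        m = regex.search(text, pos)
        if m is None:
            break
        spans.append(m.span())
        if m.end() == m.start():
            # Empty match: nothing was consumed, so restarting at m.end()
            # would find the same empty match again. Step one character on.
            pos = m.end() + 1
        else:
            # Normal case: continue from the end of the previous match.
            pos = m.end()
    return spans


if __name__ == "__main__":
    print(find_all_spans(re.compile(r"\d*"), "a12b"))
```

Stepping forward by one character is the simplification here; it is enough
to guarantee that the loop terminates.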

> I presume that if I name those substring (pcretest does not seem to
> observe naming substrings) then I would simply retrieve by name and
> not really bother about 'unset' values and index values.


Yes, you can use pcre_get_named_substring() or
pcre_copy_named_substring() to do that. To get pcretest to display
the values captured by name, you have to use the \C or \G escapes in the
data lines.
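
Retrieval by name can be sketched as follows, again in Python's re module
rather than with the PCRE functions named above. The group names rate and
ceil and the helper collect_named are assumptions made for the example. In
this sketch a name is still unset in the match where the other alternative
was taken, so the loop skips those values.

```python
import re

# The (?P<name>...) syntax for named groups is used here.
pattern = re.compile(r"rate (?P<rate>\d+)Kbit|ceil (?P<ceil>\d+)Kbit")
subject = "quantum 12500 rate 300000Kbit ceil 540000Kbit"


def collect_named(regex, text):
    """Gather every named group that was set in any successive match."""
    values = {}
    for m in regex.finditer(text):
        for name, value in m.groupdict().items():
            if value is not None:  # skip names unset in this match
                values[name] = value
    return values


if __name__ == "__main__":
    print(collect_named(pattern, subject))
```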

Philip

--
Philip Hazel