[pcre-dev] Using pcre: the /g behaviour

Author: jonetsu
Date:  
To: pcre-dev
Subject: [pcre-dev] Using pcre: the /g behaviour
Hello,

Some guidance about using pcre would be much appreciated.

I am trying to extract values from a string (eventually strings
separated by Unix newlines) and found out about named substrings,
which seem to be great, as I saw using pcredemo:

./pcredemo "rate (?'rate'\d+)Kbit" "quantum 12500 rate 300000Kbit ceil
540000Kbit"

Match succeeded at offset 14
0: rate 300000Kbit
1: 300000
Named substrings
(1) rate: 300000

But there is more than the rate to extract, and this is where I am
stuck: I have not found how to make pcredemo continue the search after
the first match.

pcretest can do it, in this case finding both rate and ceil values by
adding the /g switch to the expression:

/rate (\d+)Kbit|ceil (\d+)Kbit/g
quantum 12500 rate 300000Kbit ceil 540000Kbit
0: rate 300000Kbit
1: 300000
0: ceil 540000Kbit
1: <unset>
2: 540000

Whereas pcredemo stops at the rate value (the input string is
continuous):

./pcredemo "rate (\d+)Kbit|ceil (\d+)Kbit" "quantum 12500 rate
300000Kbit ceil 540000Kbit"

Match succeeded at offset 14
0: rate 300000Kbit
1: 300000

I have tried the PCRE_NOTEMPTY_ATSTART|PCRE_ANCHORED options for
pcre_exec(), but to no avail (the result is 'no match').

About this I have two questions:

1) How can the /g option be added to pcredemo ?

2) In the case of finding both rate and ceil values, as with pcretest
above, how come the vector index is again at 0 for the ceil value, and
what does 'unset' mean? I presume that if I name those substrings
(pcretest does not seem to report named substrings) then I would
simply retrieve them by name and not really bother about 'unset'
values and index values.
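
To make both questions concrete, here is a minimal sketch of a "find
all matches" loop. It is only an illustration under stated assumptions:
it uses the POSIX <regex.h> API rather than PCRE so that it builds
without the PCRE library, [0-9] stands in for \d (which POSIX extended
syntax lacks), and find_all() and struct hit are hypothetical names, not
part of pcredemo or PCRE. The idea it shows is that each new search
starts where the previous match ended, that every match numbers its
substrings from 0 again, and that a group belonging to the alternative
that did not take part in the match is reported as unset (offset -1
here).

```c
#include <regex.h>
#include <string.h>

#define MAX_GROUPS 3            /* whole match plus two capture groups */

/* Hypothetical record: which capture group was set, and its text. */
struct hit {
    int  group;                 /* 0 if no capture group was set */
    char value[32];
};

/* Hypothetical helper: finds every match of `pattern` in `subject`,
 * restarting each search at the end of the previous match.
 * Returns the number of matches stored, or -1 if the pattern is bad. */
int find_all(const char *pattern, const char *subject,
             struct hit *hits, int max_hits)
{
    regex_t re;
    regmatch_t m[MAX_GROUPS];
    const char *p = subject;    /* current start of the search */
    int count = 0;
    int eflags = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    while (count < max_hits &&
           regexec(&re, p, MAX_GROUPS, m, eflags) == 0) {
        hits[count].group = 0;
        hits[count].value[0] = '\0';

        /* Offsets are relative to p; group numbering restarts at 0
         * for every match. */
        for (int g = 1; g < MAX_GROUPS; g++) {
            if (m[g].rm_so == -1)
                continue;       /* group did not participate: unset */
            size_t len = (size_t)(m[g].rm_eo - m[g].rm_so);
            if (len >= sizeof hits[count].value)
                len = sizeof hits[count].value - 1;
            memcpy(hits[count].value, p + m[g].rm_so, len);
            hits[count].value[len] = '\0';
            hits[count].group = g;
            break;              /* keep the first group that is set */
        }
        count++;

        /* Continue after this match; step one character further on an
         * empty match so the loop cannot stall. */
        if (m[0].rm_eo == m[0].rm_so) {
            if (p[m[0].rm_eo] == '\0')
                break;
            p += m[0].rm_eo + 1;
        } else {
            p += m[0].rm_eo;
        }
        eflags = REG_NOTBOL;    /* p no longer starts the line */
    }

    regfree(&re);
    return count;
}
```

Called with the pattern "rate ([0-9]+)Kbit|ceil ([0-9]+)Kbit" and the
subject string used above, the sketch should report two matches: the
first with group 1 set, the second with group 1 unset and group 2 set.
With PCRE the same loop would presumably call pcre_exec() again, passing
the end offset of the previous match (ovector[1]) as the start offset.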

Thanks for any suggestions and comments.