Re: [pcre-dev] need help with a particular regex

Author: Nuno Lopes
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] need help with a particular regex
OK, thanks for the explanation (both Philip and Sheri)! I'll try your
suggestions.

Thanks,
Nuno


----- Original Message -----
From: "Philip Hazel" <ph10@???>
To: "Nuno Lopes" <nunoplopes@???>
Cc: <pcre-dev@???>
Sent: Thursday, September 13, 2007 2:18 PM
Subject: Re: [pcre-dev] need help with a particular regex


> On Thu, 13 Sep 2007, Nuno Lopes wrote:
>
>> uhm, my idea was like: match until you can advance to the next state
>
> Except for the fact that there is no concept of "states" in Perl-style
> regex matching, that is what the atomic group notation expresses.
>
> (?> ... something ... )
>
> means "match within the () in the normal way, possibly backtracking,
> etc., but once you pass the closing ), that's it: no going back". That is
> also what (*PRUNE) means: when (*PRUNE) is encountered, all
> previously-remembered backtracking positions are forgotten.
>
> I like to think of this kind of matching as like a depth-first search of
> a tree of possibilities. If you use (?>...) atomic groups, they wrap up
> little bits of the tree so that the first way that is found through that
> bit is chosen, and cannot be changed, but you can jump right back over
> to a previous bit of the regex. What (*PRUNE) does is to "fix" the current
> path through the tree up to the current point. You can't go back and try
> any other paths.
>
>> this case the constant string). e.g.:
>> ((*SOMEOPT .+))\d+
>> run on 'aa1234' would give \1 = 'aa'
>
> That's what (.*?)\d+ would give if this is a standalone regex. But if
> it's part of something longer, and what followed failed, it could also
> set \1 to aa1, aa12, or aa123, while trying all the possibilities.
>
> But if you use (?>(.*?)\d+) then it is forced to stick with aa (which it
> finds first). This should be the same as (.*?)\d+(*PRUNE) in fact.
>
> Philip
>
> --
> Philip Hazel, University of Cambridge Computing Service.
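
A small sketch of the difference Philip describes between (.*?)\d+ and (?>(.*?)\d+), written in Python rather than against PCRE itself. Several things here are assumptions made for illustration and are not from the thread: the trailing " \1$" (a stand-in for "something longer" that can fail), the sample strings, and the function names. Python's re module has no (*PRUNE), and only Python 3.11 and later accept (?>...) directly, so the atomic group is emulated with the lookahead-plus-backreference idiom, which relies on a lookahead not being re-entered once it has succeeded.

```python
import re

# Lazy group followed by something that can fail: " \1$" demands that the
# text after the space is exactly what group 1 captured.
LAZY = re.compile(r"(.*?)\d+ \1$")

# Emulated atomic group for (?>(.*?)\d+). The lookahead captures the first
# way through "(.*?)\d+" and is never retried, so the backreference \1 can
# only consume that one result. Group 1 is the whole atomic span; group 2
# plays the role of \1 in the thread.
ATOMIC = re.compile(r"(?=((.*?)\d+))\1 \2$")


def lazy_prefix(text):
    """Return the lazy group's capture, or None if there is no match."""
    m = LAZY.match(text)
    return m.group(1) if m else None


def atomic_prefix(text):
    """Return the prefix captured inside the emulated atomic group."""
    m = ATOMIC.match(text)
    return m.group(2) if m else None


if __name__ == "__main__":
    for sample in ("aa1234 aa", "aa1234 aa1"):
        print(sample, "lazy:", lazy_prefix(sample), "atomic:", atomic_prefix(sample))
```

On 'aa1234 aa' both forms settle on aa. On 'aa1234 aa1' the plain lazy form backtracks and widens the group to aa1 so that the tail can match, while the atomic form is stuck with aa and the match fails as a whole.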