Re: [pcre-dev] Ungreedy quantification in atomic groups

Author: Philip Hazel
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] Ungreedy quantification in atomic groups
On Tue, 1 May 2007, Sheri wrote:

> Sheri wrote:
> > Is there any performance benefit to using an atomic group that includes
> > alternates with ungreedy quantification, e.g., something like this?
> >
> > (?>\\\\|\\t|\\Q.+?\\E|\\$(?:U|u|L|l|T|t)?\\d{1,3})
> >
> > Or would it be just as well (or better) to omit the (?>
> >
> > Thanks,
> > Sheri
> >
> >
> >
> Hi Philip, any idea on this?


Why don't you try some timing tests? That's what I would do if I needed
to know the answer. The pcretest program has a handy -t option.
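
For anyone without pcretest to hand, a rough sketch of the same kind of timing test in Python follows. It is not a substitute for timing PCRE itself: Python's re module is a different engine, has no \Q...\E (so the ".+?" alternative is written as an escaped literal here), and only accepts (?>...) from Python 3.11 onwards. The subject string and the repeat counts are made up for illustration.

```python
import re
import sys
import timeit

# The alternatives from the pattern in question, with single backslashes.
# Assumption: \Q.+?\E is written as the escaped literal \.\+\? because
# Python's re module does not support \Q...\E.
BODY = r"\\|\t|\.\+\?|\$(?:U|u|L|l|T|t)?\d{1,3}"

# Hypothetical subject text containing each kind of token several times.
SUBJECT = "text $U12 and \\ and \t and .+? and $3 more text " * 200

patterns = {"plain": "(?:" + BODY + ")"}
if sys.version_info >= (3, 11):
    # Native atomic groups were added to Python's re in 3.11.
    patterns["atomic"] = "(?>" + BODY + ")"


def time_pattern(pattern, subject, repeat=3, number=10):
    """Return the best wall-clock time for `number` findall() passes."""
    compiled = re.compile(pattern)
    return min(timeit.repeat(lambda: compiled.findall(subject),
                             repeat=repeat, number=number))


results = {name: time_pattern(p, SUBJECT) for name, p in patterns.items()}
for name, seconds in results.items():
    print(f"{name}: {seconds:.6f} s")
```

Taking the minimum of several repeats reduces noise from other processes; the absolute numbers only mean something relative to each other on the same machine.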

Speaking without having done any tests, if that is a complete regex, I
can't see any point to an atomic group. The reason one uses atomic
groups is to stop backtracking into them when something later in the
regex fails. As your entire regex is an atomic group, there is no
possibility of that happening.
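
A small sketch of the backtracking point, using a toy pattern that is not from this thread. Python is used only for convenience; because its re module accepts (?>...) only from version 3.11, the atomic group is emulated here with the lookahead-plus-backreference idiom (?=(X))\1, which behaves the same way for this purpose. The first pair shows a group that is the whole pattern, the second pair shows what can happen once something follows the group.

```python
import re

subject = "abc"

# The group is the entire pattern: nothing comes after it, so nothing can
# fail later and force the engine to backtrack into the group.
plain_whole = re.compile(r"(?:a|ab)")
atomic_whole = re.compile(r"(?=(a|ab))\1")    # emulates (?>a|ab)

# The group is followed by "c": the plain group can backtrack from "a"
# to "ab" when "c" fails to match, the atomic one cannot.
plain_part = re.compile(r"(?:a|ab)c")
atomic_part = re.compile(r"(?=(a|ab))\1c")    # emulates (?>a|ab)c


def matched(pattern, text):
    """Return the matched text, or None if the pattern does not match."""
    m = pattern.search(text)
    return m.group(0) if m else None


whole_results = (matched(plain_whole, subject), matched(atomic_whole, subject))
part_results = (matched(plain_part, subject), matched(atomic_part, subject))
print("whole pattern:", whole_results)
print("part of pattern:", part_results)
```

With the group as the whole pattern both forms match the same text; with the trailing "c" the plain form matches and the atomic form does not match at all.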

If that is only part of a regex, then using an atomic group may change
what is matched. I'm not sure I understand the regex. I assume the
backslashes have been doubled, otherwise it doesn't make sense. So
really you have

(?>\\|\t|\Q.+?\E|\$(?:U|u|L|l|T|t)?\d{1,3})

but that still doesn't make too much sense because \Q.+?\E matches the
literal string ".+?". So where is the ungreedy quantifier?
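
To make the difference concrete, here is a short sketch in Python. Python's re module has no \Q...\E, so re.escape() stands in for it as the nearest equivalent; the sample strings are invented for illustration.

```python
import re

# Quoted form: the three characters . + ? are matched literally,
# which is what \Q.+?\E does in PCRE.
quoted = re.compile(re.escape(".+?"))

literal_hit = quoted.search("x.+?y")    # contains the literal text .+?
literal_miss = quoted.search("abcdef")  # ordinary text, no literal .+?

# Unquoted form: .+? really is an ungreedy quantifier, so it stops at the
# first ">" it can, whereas the greedy .+ runs on to the last one.
ungreedy = re.search(r"<.+?>", "<a><b>").group(0)
greedy = re.search(r"<.+>", "<a><b>").group(0)

print(literal_hit.group(0), literal_miss, ungreedy, greedy)
```

Only the unquoted form involves any quantifier; the quoted form is just a fixed three-character string.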

Philip

--
Philip Hazel, University of Cambridge Computing Service.