Re: [pcre-dev] Ungreedy quantification in atomic groups

Author: Sheri
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] Ungreedy quantification in atomic groups
Philip Hazel wrote:
> On Tue, 1 May 2007, Sheri wrote:
>
>
>> Sheri wrote:
>>
>>> Is there any performance benefit to using an atomic group that includes
>>> alternates with ungreedy quantification, e.g., something like this?
>>>
>>> (?>\\\\|\\t|\\Q.+?\\E|\\$(?:U|u|L|l|T|t)?\\d{1,3})
>>>
>>> Or would it be just as well (or better) to omit the (?>
>>>
>>> Thanks,
>>> Sheri
>>>
>>>
>>>
>>>
>> Hi Philip, any idea on this?
>>
>
> Why don't you try some timing tests? That's what I would do if I needed
> to know the answer. The pcretest program has a handy -t option.
>
> Speaking without having done any tests, if that is a complete regex, I
> can't see any point to an atomic group. The reason one uses atomic
> groups is to stop backtracking into them when something later in the
> regex fails. As your entire regex is an atomic group, there is no
> possibility of that happening.
>
> If that is only part of a regex, then using an atomic group may change
> what is matched. I'm not sure I understand the regex. I assume the
> backslashes have been doubled, otherwise it doesn't make sense. So
> really you have
>
> (?>\\|\t|\Q.+?\E|\$(?:U|u|L|l|T|t)?\d{1,3})
>
> but that still doesn't make too much sense because \Q.+?\E matches the
> literal string ".+?". So where is the ungreedy qualifier?
>
> Philip
>
>
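For anyone reading along, here is a minimal sketch of the backtracking point made above. It uses Python's re module rather than PCRE, so it only illustrates the general behaviour; the toy patterns and the helper names atomic() and first_match() are made up for this example. Native (?>...) syntax is assumed to need Python 3.11 or later, so the helper falls back to the well-known lookahead-plus-backreference emulation on older versions.

```python
import re
import sys


def atomic(body, name="atom"):
    """Wrap `body` so the engine cannot backtrack into it (hypothetical helper)."""
    if sys.version_info >= (3, 11):
        return f"(?>{body})"  # native atomic group syntax
    # Fallback for older Pythons: a lookahead is itself atomic, so capture
    # inside it and then consume the captured text with a backreference.
    return f"(?=(?P<{name}>{body}))(?P={name})"


def first_match(pattern, text):
    """Return the matched text, or None when the pattern fails."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


# Case 1: a greedy quantifier that must give back one character
# for the rest of the pattern to succeed.
plain_greedy = first_match(r"(?:a+)ab", "aaab")
atomic_greedy = first_match(atomic(r"a+") + r"ab", "aaab")

# Case 2: alternatives where the first one to succeed is a dead end.
# The atomic group also discards the untried alternative.
plain_alt = first_match(r"(?:a|ab)c", "abc")
atomic_alt = first_match(atomic(r"a|ab") + r"c", "abc")

# Case 3: the group is the entire pattern, so nothing later can fail
# and the atomic and non-atomic forms match the same text.
plain_whole = first_match(r"(?:a|ab)", "abc")
atomic_whole = first_match(atomic(r"a|ab"), "abc")

print(plain_greedy, atomic_greedy)
print(plain_alt, atomic_alt)
print(plain_whole, atomic_whole)
```

Cases 1 and 2 show the group changing what is matched when more pattern follows it; case 3 corresponds to the situation where the whole regex is the atomic group. None of this says anything about speed, which still needs a timing test.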

Oops, should have been \\Q.+?\\E in there. And there would be many more
alternates in the real pattern, but those are the only ones with
question marks and there are no stars.

Regards,
Sheri
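
As a footnote to the \Q...\E question in the quoted reply, the sketch below contrasts a quoted literal ".+?" with a real ungreedy quantifier. It is written in Python, whose re module has no \Q...\E syntax as far as I know, so re.escape() is used here as a stand-in for the quoting; the subject string is invented for the example.

```python
import re

# Stand-in for \Q.+?\E: escape every metacharacter so the three
# characters ".", "+" and "?" are matched literally.
literal = re.escape(".+?")

# The same three characters left unescaped: "any character, one or
# more times, as few as possible" (an ungreedy quantifier).
lazy = r".+?"

subject = "abc .+? xyz"

# The quoted form only ever matches the literal text ".+?".
literal_hit = re.search(literal, subject).group(0)
literal_miss = re.search(literal, "abc")

# With nothing after it, the lazy form stops after one character.
lazy_hit = re.search(lazy, subject).group(0)

# With something after it, the lazy form grows only as far as needed
# for the rest of the pattern to match.
lazy_then_z = re.search(r".+?z", subject).group(0)

print(repr(literal_hit), literal_miss)
print(repr(lazy_hit), repr(lazy_then_z))
```

So only the unquoted form involves any quantifier at all, which is why the quoted version contains no ungreedy matching to speed up.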