Re: [pcre-dev] Atomic group optimizations issue

Top Page
Delete this message
Author: Zoltán Herczeg
Date:  
To: ND
CC: Pcre-dev
Subject: Re: [pcre-dev] Atomic group optimizations issue
Hi Naden,

I like that you always have some interesting questions :) It is great to talk about the internals and optimizations of PCRE.

Here is the answer to your question: in the second case, you see the effect of an optimization called "tail recursion". You can even see that in the first case, if you remove the capturing brackets around 1:

re> /(?>1|2|.)*?(3|4)/
data> \Mabcdefghijklmnopqrstuvwxyz

Minimum match() limit = 190
Minimum match() recursion limit = 3
No match

However, in your first case, PCRE is forced to use OP_ONCE instead of OP_ONCE_NC, and that opcode is more costly (in terms of both stack and runtime), since the interpreter have to restore all capturing brackets (JIT is a bit more clever here, it only restores the brackets inside the atomic block). It does not know that the capturing bracket will never match. At least not for this particular input.

The result is also different, if we put the capturing bracket around the dot:

re> /(?:1|2|(.))*?(3|4)/
data> \Mabcdefghijklmnopqrstuvwxyz

Minimum match() limit = 217
Minimum match() recursion limit = 29
No match

You can see an increase here is well.

Suggestion: if it is possible, don't use capturing brackets inside atomic blocks.

Regards,
Zoltan

ND <nadenj@???> írta:
>Good day!
>
>Here is two pcretest.exe listings:
>
>
>PCRE version 8.34-RC 2013-06-14
>/(?>(1)|2|.)*?(3|4)/
>\Mabcdefghijklmnopqrstuvwxyz
>Minimum match() limit = 217
>Minimum match() recursion limit = 55
>No match
>
>
>PCRE version 8.34-RC 2013-06-14
>/(?:(1)|2|.)*?(3|4)/
>\Mabcdefghijklmnopqrstuvwxyz
>Minimum match() limit = 217
>Minimum match() recursion limit = 3
>No match
>
>
>In listing 2 "(?>" is replaced by "(?:". And "Minimum match() recursion
>limit" unexpectedly reduces to 3 from 55.
>
>I guess this happens due some internal PCRE optimizations.
>Is there possibility to reduce "Minimum match() recursion limit" for
>atomic groups in "(?:"-way?
>
>
>Best regards
>
>--
>## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
>