Re: [pcre-dev] A tweak for the NO_RECURSE option

Author: Philip Hazel
Date:  
To: Graycode
CC: pcre-dev
Subject: Re: [pcre-dev] A tweak for the NO_RECURSE option
On Mon, 12 Dec 2011, Graycode wrote:

> When PCRE is compiled with NO_RECURSE then it uses allocated heap
> storage instead of recursion with the stack.
>
> The majority of expressions that I use do not require recursion and
> perhaps neither does a large portion of the PCRE testdata suite.


I think you might have misunderstood this. The terminology is rather
badly chosen. "Recursion" here does not refer to recursion in the
patterns, but to the fact that the match() function calls itself
recursively while performing the match. This uses memory on the system
stack, and that can run out in environments that have a limited system
stack. Almost every pattern except the very simplest will use one or
more recursive calls. For example, the pattern /(ab)(cd)(ef)/ recurses 4
times (you can find this out by running pcretest with the -M option).

The NO_RECURSE feature is designed specifically to avoid putting data on
the system stack. Instead of calling match() recursively, it saves what
it needs in a frame allocated on the heap, chained to the previous
frame, and jumps back to the start of the function.
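
To make the mechanism concrete, here is a minimal sketch of the same idea applied to a toy matcher. It is not PCRE code: toy_frame, push_frame() and toy_match() are hypothetical names, and the only metacharacter assumed is '*', meaning "any run of characters". Every "recursive call" becomes a heap-allocated frame linked to its caller, followed by a jump back to the top of the function.

```c
#include <stdlib.h>

/* Hypothetical frame type; PCRE's real heapframe holds far more state. */
typedef struct toy_frame {
    struct toy_frame *prev;   /* frame of the "caller" */
    const char *p;            /* current pattern position */
    const char *s;            /* current subject position */
} toy_frame;

/* Allocate a new frame on the heap and link it to its caller. */
static toy_frame *push_frame(toy_frame *prev, const char *p, const char *s)
{
    toy_frame *f = malloc(sizeof *f);
    if (f != NULL) { f->prev = prev; f->p = p; f->s = s; }
    return f;
}

/* Returns 1 on match, 0 on no match, -1 if memory runs out. */
int toy_match(const char *pattern, const char *subject)
{
    int result;
    toy_frame *next;
    toy_frame *frame = push_frame(NULL, pattern, subject);
    if (frame == NULL) return -1;

start:                                   /* "entry" of the function */
    if (*frame->p == '\0') { result = (*frame->s == '\0'); goto ret; }
    if (*frame->p == '*') {
        /* Would be a recursive call: push a frame and jump to the top. */
        next = push_frame(frame, frame->p + 1, frame->s);
        if (next == NULL) { result = -1; goto ret; }
        frame = next;
        goto start;
    }
    if (*frame->s != '\0' && *frame->p == *frame->s) {
        frame->p++; frame->s++;          /* literal: no new frame needed */
        goto start;
    }
    result = 0;

ret:                                     /* "return" to the caller frame */
    next = frame->prev;
    free(frame);
    frame = next;
    if (frame == NULL) return result;    /* outermost frame has returned */
    if (result != 0) goto ret;           /* success or error: keep unwinding */
    if (*frame->s == '\0') goto ret;     /* star has nothing left to consume */
    frame->s++;                          /* let the star eat one more char */
    next = push_frame(frame, frame->p + 1, frame->s);
    if (next == NULL) { result = -1; goto ret; }
    frame = next;
    goto start;
}
```

A call such as toy_match("a*c", "abbbc") uses one C stack frame however many times the star backtracks; all the per-level state lives in malloc'd frames instead.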

> --- pcre_exec.c  (version 8.21-RC1)
> +++ pcre_exec.c  (working copy)
> @@ -328,7 +328,8 @@
>    {\
>    heapframe *oldframe = frame;\
>    frame = oldframe->Xprevframe;\
> -  (pcre_stack_free)(oldframe);\
> +  if (oldframe != &frame_zero)\
> +    (pcre_stack_free)(oldframe);\
>    if (frame != NULL)\
>      {\
>      rrc = ra;\
> @@ -485,8 +486,9 @@
>  heap whenever RMATCH() does a "recursion". See the macro definitions above. */

>
> #ifdef NO_RECURSE
> -heapframe *frame = (heapframe *)(pcre_stack_malloc)(sizeof(heapframe));
> -if (frame == NULL) RRETURN(PCRE_ERROR_NOMEMORY);
> +heapframe frame_zero;

^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^
This statement is creating a frame *on the system stack*, which is
precisely the thing NO_RECURSE is trying to avoid doing.
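
As a rough illustration of what the quoted change amounts to, here is a sketch with hypothetical names (toy_frame and toy_run() are not from pcre_exec.c). The first frame is a local variable, so it occupies sizeof(toy_frame) bytes of system stack, and the unwinding code has to skip it when freeing. However deep the chain gets, exactly one allocation is saved.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for PCRE's heapframe. */
typedef struct toy_frame {
    struct toy_frame *prev;
} toy_frame;

/* Builds a chain of extra_frames heap frames on top of a stack-resident
   first frame, then unwinds it. Returns the number of frames passed to
   free(), or -1 if an allocation failed. */
int toy_run(int extra_frames)
{
    toy_frame frame_zero;            /* lives on the system stack */
    toy_frame *frame = &frame_zero;
    int freed = 0;
    int failed = 0;
    int i;

    frame_zero.prev = NULL;

    for (i = 0; i < extra_frames; i++) {
        toy_frame *next = malloc(sizeof *next);
        if (next == NULL) { failed = 1; break; }
        next->prev = frame;
        frame = next;
    }

    while (frame != NULL) {
        toy_frame *old = frame;
        frame = old->prev;
        if (old != &frame_zero) {    /* never free the stack frame */
            free(old);
            freed++;
        }
    }
    return failed ? -1 : freed;
}
```

With this layout toy_run(3) frees three frames and toy_run(0) frees none, so the saving is a single malloc/free pair per call, paid for with one frame's worth of system stack.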

Disclaimer: I'm writing this in a hurry. Apologies if I have
misunderstood you, but if all you are doing is saving one malloc for a
frame, I don't think it will apply to many regexes, if any.

Philip

--
Philip Hazel