Re: [pcre-dev] Proposal for a new API for PCRE

Author: ph10
Date:
To: Ralf Junker
CC: pcre-dev
Subject: Re: [pcre-dev] Proposal for a new API for PCRE
On Thu, 29 Aug 2013, Ralf Junker wrote:

> > By default, PCRE is compiled to use the system stack for recursive
> > function calls when matching patterns using the interpreter (not JIT)
> > with pcre2_exec(). In some environments, where the size of this stack
> > is limited, PCRE is often compiled to use heap storage instead. The
> > memory blocks that are used for this purpose are all the same size,
>
> Are these memory blocks all the same size for ONE individual pattern or
> for ALL different patterns as well? I realize they might change between
> different PCRE versions.


They are the same for all patterns, and yes, they might change between
PCRE versions but in practice I would be very reluctant to increase the
size. (An increase would imply an increase in the stack frame when PCRE
is compiled to use the system stack, and we know some environments have
problems with stack sizes.)

> > and are requested and freed in last-out-first-in order. A private
> > memory manager could implement this kind of usage more efficiently
> > than the general case; to make this possible, two further memory
> > management functions can be added to a context:
> >
> > pcre2_set_recursion_memory_management(context,
> > private_recursion_malloc, private_recursion_free);
>
> Would it be useful to add private_recursion_begin() and
> private_recursion_end() callbacks so that
>
> a) Applications can initialize their recursion memory manager and
>
> b) optionally clean up after matching


Surely an application can detect a first-time call itself? And, since
these blocks are always last-get-first-free you can detect when the
first block is freed. This is such a specialist feature that I am not in
favour of any more elaboration.
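
A minimal sketch of such a manager, with no begin/end callbacks. It relies only on the two properties stated in this thread: all blocks are the same size, and the most recently obtained block is always the first one freed. The names private_recursion_malloc and private_recursion_free come from the proposal; their signatures here are an assumption modelled on the existing pcre_stack_malloc/pcre_stack_free hooks in PCRE 8.x, and FRAME_SIZE, MAX_FRAMES and recursion_blocks_in_use are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical limits: the real block size is chosen by PCRE, not by us. */
#define FRAME_SIZE 256   /* assumed upper bound on one requested block */
#define MAX_FRAMES 1024  /* assumed maximum nesting depth we support   */

/* Single global state for brevity; not thread-safe. */
static unsigned char *pool = NULL;  /* one slab holding every block      */
static size_t depth = 0;            /* number of blocks currently in use */

void *private_recursion_malloc(size_t size)
{
    if (size > FRAME_SIZE || depth == MAX_FRAMES)
        return NULL;
    if (pool == NULL) {
        /* First-time call, detected without a "begin" callback. */
        pool = (unsigned char *)malloc((size_t)FRAME_SIZE * MAX_FRAMES);
        if (pool == NULL)
            return NULL;
    }
    return pool + (size_t)FRAME_SIZE * depth++;
}

void private_recursion_free(void *block)
{
    /* Blocks must come back most-recent-first. */
    assert(depth > 0);
    assert((unsigned char *)block == pool + (size_t)FRAME_SIZE * (depth - 1));
    (void)block;  /* silences the unused warning when NDEBUG is defined */

    if (--depth == 0) {
        /* The first block has been freed: clean up without an "end" callback. */
        free(pool);
        pool = NULL;
    }
}

/* Hypothetical helper so callers can inspect the current depth. */
size_t recursion_blocks_in_use(void)
{
    return depth;
}
```

Releasing the slab as soon as the depth returns to zero keeps the example short; a real manager would more likely keep it for the next match.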

> I would appreciate 'get' functions for ALL properties.


Noted.

> I would prefer option B or C as it frees applications from figuring out the
> correct memory size of ovector.


I don't understand your comment ... the size is given explicitly for
both B and C, so the application must give it. (If an application wants
to get a precise size for ovector, it can use an info call to find out
the highest capture number, but many applications that are processing
more than one regex just allocate a largish ovector and use it for all
of them.)
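
As a rough illustration of the two sizing strategies, here is a sketch that assumes the PCRE 8.x convention, where pcre_exec() takes an int vector with three slots per capturing group plus three for the whole match. The new API under discussion may size things differently. In PCRE 8.x the capture count would come from pcre_fullinfo() with PCRE_INFO_CAPTURECOUNT; here it is simply passed in as a non-negative number. OVECCOUNT and both helper names are hypothetical.

```c
#include <stddef.h>

/* A "largish" shared vector for all patterns; must be a multiple of 3. */
#define OVECCOUNT 30

/* Precise size (in ints) for a pattern with capture_count groups,
 * assuming the PCRE 8.x layout of three ints per group, plus three
 * for group 0 (the whole match). capture_count must be >= 0. */
size_t ovector_ints_needed(int capture_count)
{
    return ((size_t)capture_count + 1) * 3;
}

/* Returns 1 if the shared vector can hold every capture, else 0. */
int shared_ovector_is_big_enough(int capture_count)
{
    return ovector_ints_needed(capture_count) <= OVECCOUNT;
}
```

With OVECCOUNT set to 30, the shared vector covers patterns with up to nine capturing groups.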

> The memory is then also maintained by the
> pcre2_match_data and automatically freed if that structure is destroyed.


That is true, though many applications allocate ovector on the system
stack, which also avoids explicit freeing.

Philip

--
Philip Hazel