Re: [pcre-dev] A native pcre exec for JIT

Author: Philip Hazel
Date:  
To: Giuseppe D'Angelo
CC: pcre-dev
Subject: Re: [pcre-dev] A native pcre exec for JIT
On Sun, 30 Sep 2012, Giuseppe D'Angelo wrote:

> It would probably be the best to use opaque pointers. This way, you
> can change its memory layout at any time (f.i. adding new fields)
> without breaking ABI compatibility and without having to allocate
> extra memory in a (opaque?) pointer inside the pcre_context structure.
> One could either request one to be allocated by PCRE, or invoke a PCRE
> function to know the size in bytes of the structure, size to be passed
> to malloc or a custom allocation routine.


I don't think ABI compatibility is an issue: just as with the existing
pcre_extra block, it can be documented that the order of the fields in
the structure is not fixed and may change between releases. However,
that's a small point.

More importantly, using an opaque pointer means that functions will have
to be supplied to set and read the values in the structure's fields. As
there will be more than one type (pointers, ints, etc.), there would have
to be either:

(1) A single function using a (void *) argument to get/set values; or
(2) Different functions for different data types.

I feel this gets quite clumsy. For example, two of the fields will be
match_limit and match_limit_recursion. Setting them using (1) would need
code like this:

long int match_limit = 10000;
long int match_limit_recursion = 5000;
pcre_set_context(pc, PCRE_CONTEXT_MATCH_LIMIT, &match_limit);
pcre_set_context(pc, PCRE_CONTEXT_MATCH_LIMIT_RECURSION, &match_limit_recursion);

Or, using (2):

pcre_set_context_long_int(pc, PCRE_CONTEXT_MATCH_LIMIT, 10000);
pcre_set_context_long_int(pc, PCRE_CONTEXT_MATCH_LIMIT_RECURSION, 5000);

with several other functions for other types of value. Reading the
values would require an equivalent set of functions.
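
To make the cost of style (1) concrete, here is a minimal sketch of what
might sit behind the opaque pointer. Everything in it is an assumption
for illustration: pcre_context, pcre_context_create(), pcre_set_context(),
pcre_get_context() and the PCRE_CONTEXT_* selectors are hypothetical names
taken from or modelled on this discussion, not an existing PCRE API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical selectors; not part of any released PCRE header. */
enum {
    PCRE_CONTEXT_MATCH_LIMIT,
    PCRE_CONTEXT_MATCH_LIMIT_RECURSION
};

/* Only this forward declaration would be visible to users. */
typedef struct pcre_context pcre_context;

/* The layout would live inside the library, hidden from callers. */
struct pcre_context {
    long int match_limit;
    long int match_limit_recursion;
};

/* Hypothetical allocator: returns a zeroed context, or NULL on failure. */
pcre_context *pcre_context_create(void)
{
    return calloc(1, sizeof(pcre_context));
}

/* Style (1): one setter, value passed through a (void *) argument.
   Returns 0 on success, -1 for an unknown selector. */
int pcre_set_context(pcre_context *pc, int what, const void *value)
{
    switch (what) {
    case PCRE_CONTEXT_MATCH_LIMIT:
        pc->match_limit = *(const long int *)value;
        return 0;
    case PCRE_CONTEXT_MATCH_LIMIT_RECURSION:
        pc->match_limit_recursion = *(const long int *)value;
        return 0;
    default:
        return -1;
    }
}

/* The matching getter: the caller must know the type behind each selector. */
int pcre_get_context(const pcre_context *pc, int what, void *value)
{
    switch (what) {
    case PCRE_CONTEXT_MATCH_LIMIT:
        *(long int *)value = pc->match_limit;
        return 0;
    case PCRE_CONTEXT_MATCH_LIMIT_RECURSION:
        *(long int *)value = pc->match_limit_recursion;
        return 0;
    default:
        return -1;
    }
}

/* Usage: set a limit, then read it back. */
void example_usage(void)
{
    pcre_context *pc = pcre_context_create();
    long int match_limit = 10000;
    long int readback = 0;

    assert(pc != NULL);
    assert(pcre_set_context(pc, PCRE_CONTEXT_MATCH_LIMIT, &match_limit) == 0);
    assert(pcre_get_context(pc, PCRE_CONTEXT_MATCH_LIMIT, &readback) == 0);
    assert(readback == match_limit);
    free(pc);
}
```

Note that the compiler cannot check that the pointer passed for a given
selector has the right type, which is part of why this style feels clumsy.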

Isn't it much neater to let the user write

pc->match_limit = 10000;
pc->match_limit_recursion = 5000;

and read their values similarly? Note that this proposal does not
replace pcre_extra: that would still be returned by pcre_study(), but it
would contain only opaque study and JIT data. Indeed pcre_extra could
become totally opaque (as it was in the original PCRE).

The API documentation could insist that pcre_init_context() *must* always
be called before setting any values. This would ensure that any new fields
added in a later release are suitably initialized with defaults.
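
As a rough sketch of that rule, assuming a public structure: the layout
below, the callout_data field (standing in for "a field added in a later
release") and the EXAMPLE_DEFAULT_* values are all hypothetical
placeholders, not actual PCRE definitions.

```c
#include <assert.h>
#include <string.h>

/* Placeholder defaults, chosen only for this illustration. */
#define EXAMPLE_DEFAULT_MATCH_LIMIT            10000000L
#define EXAMPLE_DEFAULT_MATCH_LIMIT_RECURSION  10000000L

/* Hypothetical public context structure; field order is not guaranteed. */
typedef struct pcre_context {
    long int match_limit;
    long int match_limit_recursion;
    void *callout_data;   /* imagined field added in a later release */
} pcre_context;

/* Gives every field, including ones the caller has never heard of,
   a defined default value. */
void pcre_init_context(pcre_context *pc)
{
    memset(pc, 0, sizeof *pc);
    pc->match_limit = EXAMPLE_DEFAULT_MATCH_LIMIT;
    pc->match_limit_recursion = EXAMPLE_DEFAULT_MATCH_LIMIT_RECURSION;
    pc->callout_data = NULL;
}

/* Usage: initialize first, then override only the fields of interest. */
void example_usage(void)
{
    pcre_context pc;

    pcre_init_context(&pc);
    pc.match_limit = 10000;
    pc.match_limit_recursion = 5000;

    /* Code written before callout_data existed still sees it defaulted. */
    assert(pc.callout_data == NULL);
}
```

Code that only ever sets match_limit and match_limit_recursion keeps
working after a new field appears, because the init call has already given
that field its default.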

But this is just my view...

Philip

--
Philip Hazel