Re: [pcre-dev] New API proposal revision

Top Page
Delete this message
Author: ph10
Date:  
To: Rv
CC: pcre-dev
Subject: Re: [pcre-dev] New API proposal revision
On Fri, 7 Mar 2014, Rv wrote:

> Then, as I just have discovered PCRE lib, I would congratulate all of you for
> this great job.


Thank you.

> To finish, I have seen the proposal for a V2 of the API, I could not resist to
> read it (quickly), without any knowledge of what has been already discussed
> here, but I prefer to give you my first (certainly not correct) impression.


Thank you for taking the time to read it and comment.

> - For all functions concerning context, I will prefer to always start the name
> of the function by pcre2_ctx_* or pcre2_context_*


I guess I used the names that I did because of English word order. For
example, "free context" is more natural than "context free" because
English puts the verb before the object.

It is not too late to change (but as I've started - slowly - working on
a framework, it is almost too late). However, if, for example,
pcre2_free_context changed to pcre2_context_free, then for consistency
we would also have to change pcre2_free_match_data to
pcre2_match_data_free and so on. In other words, group the names by the
object they work on instead of by the action they take.

I am inclined not to change - does anybody else on this list have views?

> - I prefer to use the word "alloc" in function name when allocation is done
> and reserve the word "init" to initialize structures. Thus pcre2_init_context
> will be pcre2_ctx_alloc (and pcre2_ctx_free)


pcre2_init_context *does* initialize the structure. It seemed to me that
it would be more convenient for a user to write

pcre2_context *c = pcre2_init_context(NULL);

instead of

pcre2_context *c = pcre2_init_context(pcre2_alloc_context());

> - Are you sure this is necessary to have memory_management functions
> (private_malloc/private_free) by context ? Same question for
> recursion_memory_management ?


Yes! That is one of the main reasons for making a new API. The fact that
the current private malloc/free functions are global variables means
that they cannot be different for different threads. The new API has no
global variables.

> - Is it necessary to have a pcre2_get_user_data()/pcre2_ctx_get_udata() as
> this information is given when private_malloc() and private_free() are called
> ?


An earlier draft of the proposal was different, but during previous
discussion people said there should be complete matching groups of
set/get functions for each field in the context, for consistency. That
is why these functions are now specified.

> - I will change all returns code of the API to have the following rules:
> o Allocation functions return as standard allocation functions, that is a
> pointer, NULL if failure
> o Other functions always return an int >= 0 if OK, < 0 if NOK (#define
> PCRE2_ERR_INVAL_XX => return(-PCRE2_ERR_INVAL_XX))


The error codes are very similar to the existing API. They don't seem to
have been a problem in the past, so I saw no reason to change them.

> Thus
> prce2cc = pcre2_compile_alloc(ctx, pattern, size, options, &errcode, &errpos);
>
> - Maybe prce2cc structure will be able to keep errors information and remove
> errcode et errpos in the call of compile_alloc.


pcre2_compile returns NULL for a compile-time error - there is no
structure.

> Then the test
> if(prce2cc == NULL){
>    pcre2_get_error_message(errorcode, buffer, 120);
> }
> will become something like:
> if(prce2_compile_failure(prce2cc)) {
>    errcode = prce2_get_errcode(prce2cc);
>    errpos = prce2_get_errpos(prce2cc);
>    prce2_get_errpos(prce2cc, buffer, 120);
> and/or
>    pcre2_strerror(errcode, buffer, 120) /* like standard strerror() */
> }


I think that is unnecessarily complicated - and you still have to free
the pcre2cc structure. The code at the moment does not even allocate a
structure until it has made a first-pass check through the regex, which
is where most compile-time errors are picked up.

> - I always use the fullinfo COUNTSIZE to get the size of the expected ovector
> size
> So I prefer to have something like
> match_data = pcre2_mdata_alloc(pcre2cc, 0); /* 0 to use fullinfo(COUNTSIZE), >
> 0 for a specify size*/


You need also to pass the context (for malloc/free), but I can see that
something like this could be a useful convenience. Thank you for the
idea.

> rc = pcre2_exec(ctx, prce2cc, array_subject, lower, upper, max, stride,
> option, match_data);
> I will use array_subject (a) lower (l), upper (u), max (m) and stride (s) to
> define slice as follow (if I had time, I will implement it for python slice):
> 8<------------------------------------------------------------------
> A bit long, sorry... You can skip if no interest


Er, yes, I didn't really follow all of that.

> A bit.... to much maybe....


:-)

Philip

--
Philip Hazel