Re: [pcre-dev] New API

Author: ph10
Date:  
To: Carsten Klein
CC: pcre-dev
Subject: Re: [pcre-dev] New API
On Sat, 18 Jan 2014, Carsten Klein wrote:

> No, actually you didn't. Sorry, I mixed this up with the pcre2_version
> function. Since the size of the buffer needed for the version is easy to
> estimate, there seems not to be a problem with that function.


That raises a useful consistency point. I think I will change
pcre2_version so that it is similar to the extraction functions,
returning a positive count on success, or a negative error code
otherwise.
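
As a rough sketch of that convention (the function name, error code, and
version string below are invented for illustration and are not the actual
PCRE2 interface):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical error code and version text, for illustration only. */
#define DEMO_ERROR_NOMEMORY (-1)
static const char demo_version_text[] = "0.0-demo";

/* Copies the version text, including its terminating zero, into buffer.
   Returns the number of characters copied (not counting the zero), or a
   negative error code if the buffer is missing or too small. */
int demo_version(char *buffer, size_t size)
{
    size_t length = strlen(demo_version_text);
    if (buffer == NULL || size < length + 1) return DEMO_ERROR_NOMEMORY;
    memcpy(buffer, demo_version_text, length + 1);
    return (int)length;
}
```

A caller then needs only one test, for a negative return, exactly as with
the extraction functions.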

> In fact, that's what I meant, when I referred to "number of characters".


Oh good. There isn't really a good word for this concept; I've mostly
been using "data unit" in the documentation to mean "byte in the 8-bit
library, short (halfword) in the 16-bit library, or int (word) in the
32-bit library".
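
To make the distinction concrete, here is a small sketch for the 16-bit
case. The type and function names are made up for the example; the point is
only that a length counted in data units is neither a byte count nor,
necessarily, a character count:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical data unit types, one per library width. */
typedef uint8_t  demo_unit8;
typedef uint16_t demo_unit16;
typedef uint32_t demo_unit32;

/* Returns the length of a zero-terminated 16-bit string in data units. */
size_t demo_length16(const demo_unit16 *s)
{
    size_t n = 0;
    while (s[n] != 0) n++;
    return n;
}
```

For the string { 0x0061, 0xD83D, 0xDE00, 0 } this returns 3 data units. The
same data occupies 6 bytes, and in UTF-16 it holds only 2 characters, because
the last two units form a surrogate pair.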

> Of course, a separate function for getting the length of the substring would
> be a suitable solution, too.
>
> However, in fact you need two such functions, one to pass a stringnumber and
> one to pass the name (and the caller needs to pass a pcre2_match_data, not the
> context):


Yes, two functions of course, and indeed, the match data - that was an
oversight. For the named one, the pattern will also be needed in order
to get the name/number table.
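
Roughly, the pair might look like the sketch below. All of the structures
and names here are hypothetical stand-ins (the real pattern and match data
blocks are opaque), and unset groups are ignored to keep it short:

```c
#include <stddef.h>
#include <string.h>

#define DEMO_ERROR_NOSUBSTRING (-2)   /* hypothetical error code */

/* Hypothetical stand-ins for the compiled pattern and the match data. */
typedef struct { const char *name; int number; } demo_name_entry;
typedef struct { const demo_name_entry *names; size_t name_count; } demo_pattern;
typedef struct { const size_t *ovector; int pair_count; } demo_match_data;

/* By number: only the match data is needed. */
int demo_substring_length(const demo_match_data *md, int number)
{
    if (number < 0 || number >= md->pair_count) return DEMO_ERROR_NOSUBSTRING;
    return (int)(md->ovector[2 * number + 1] - md->ovector[2 * number]);
}

/* By name: the pattern supplies the name/number table, then the lookup
   proceeds by number as above. */
int demo_named_substring_length(const demo_pattern *pat,
                                const demo_match_data *md, const char *name)
{
    size_t i;
    for (i = 0; i < pat->name_count; i++)
        if (strcmp(pat->names[i].name, name) == 0)
            return demo_substring_length(md, pat->names[i].number);
    return DEMO_ERROR_NOSUBSTRING;
}
```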

> For my mind, returning the substring length (or the required buffer size)
> instead of PCRE_ERROR_NOMEMORY, if the provided buffer is too small, directly
> from pcre2_copy_named_substring and pcre2_copy_substring is less costly (both
> for the caller and the implementor).


But if you do that, how does the caller know there has been an error? By
the fact that the returned length is greater than the given buffer
length? That means that the _get_ and _copy_ functions have different
error behaviour, and two tests are necessary in the _copy_ case:

1. Test for negative return (PCRE_ERROR_NOSUBSTRING).
2. Test for length > buffer length.

In the _get_ case, just testing for a negative return is enough. I think
I would rather keep them the same as each other, and also unchanged from
the current API.
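
The difference can be sketched as follows. Both functions and both error
codes are invented for the comparison, and for simplicity no terminating
zero is written, so "buffer length" is just the number of characters that
fit:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes, for illustration only. */
#define DEMO_ERROR_NOMEMORY    (-1)
#define DEMO_ERROR_NOSUBSTRING (-2)

/* Current style: every failure is a negative return. */
int demo_copy_current(const char *sub, char *buffer, size_t size)
{
    size_t length;
    if (sub == NULL) return DEMO_ERROR_NOSUBSTRING;
    length = strlen(sub);
    if (length > size) return DEMO_ERROR_NOMEMORY;
    memcpy(buffer, sub, length);
    return (int)length;
}

/* Suggested style: a too-small buffer yields the required length and
   nothing is copied, so the caller must also compare against size. */
int demo_copy_suggested(const char *sub, char *buffer, size_t size)
{
    size_t length;
    if (sub == NULL) return DEMO_ERROR_NOSUBSTRING;
    length = strlen(sub);
    if (length <= size) memcpy(buffer, sub, length);
    return (int)length;
}
```

With the first form the caller writes "if (rc < 0)" and is done. With the
second, the caller needs "if (rc < 0)" followed by "if ((size_t)rc > size)"
before the buffer contents can be trusted.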

Thanks for your comments.

Philip

--
Philip Hazel