Re: [pcre-dev] PCRE2: Unnecessary substring number checks?

Author: ph10
Date:  
To: Ralf Junker
CC: pcre-dev
Subject: Re: [pcre-dev] PCRE2: Unnecessary substring number checks?
On Wed, 10 Dec 2014, Ralf Junker wrote:

> > Consider the
> > case when oveccount is large (say 100), and pattern contains, say, 5
> > capturing parentheses. Suppose a match operation passes through only
> > groups 1,2,3. In that case, the code that follows that comment will set
> > the vector for groups 4 and 5 to PCRE2_UNSET. However, what if
> > stringnumber is set to 20?
> >
> > Obviously, I should make that comment in pcre2_match.c a bit clearer.
>
> My assumption was that stringnumber 6 to 20 would also be set to PCRE2_UNSET.
> It was not clear to me from reading the comment that they are left untouched.


Quite. But the code beneath the comment is clear, and the behaviour is
deliberate: it only touches slots that are relevant to the current
pattern. After all, the ovector might be enormous.
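
As an illustration of the behaviour described above, here is a minimal sketch. It is not the code from pcre2_match.c: the names SKETCH_UNSET and sketch_unset_trailing() are hypothetical, and the ovector is modelled as a plain array of size_t pairs. It shows only the idea that groups belonging to the pattern but not reached by the match are marked unset, while slots beyond the pattern's highest group are left untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PCRE2_UNSET; the real value is defined in pcre2.h. */
#define SKETCH_UNSET (~(size_t)0)

/* Hypothetical helper, not PCRE2 source. After a match that set groups
   only up to highest_set, mark the remaining groups that exist in the
   pattern (up to top_bracket) as unset. Slots for group numbers above
   top_bracket are deliberately not written. */
static void sketch_unset_trailing(size_t *ovector, size_t oveccount,
                                  size_t top_bracket, size_t highest_set)
{
    assert(ovector != NULL && oveccount >= 1);

    /* Never write past the end of the vector. */
    size_t last = top_bracket < oveccount ? top_bracket : oveccount - 1;

    for (size_t g = highest_set + 1; g <= last; g++) {
        ovector[2 * g] = SKETCH_UNSET;      /* start offset */
        ovector[2 * g + 1] = SKETCH_UNSET;  /* end offset */
    }
}
```

With oveccount 100, top_bracket 5 and highest_set 3, only the pairs for groups 4 and 5 are written. The pair for group 20 keeps whatever it held before, which is why a caller cannot rely on it being unset.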

> > int pcre2_substring_isset(match_data, stringnumber)
>
> Exactly.
>
> > ? Is this anything more than just a cosmetic version of
> > pcre2_substring_length_by{number,name}() ? For those two functions, we
> > could add the specification that sizeptr could be NULL, in which case
> > they are exactly "test for this group being set".
>
> The dedicated and properly named function would probably be easier to find by
> developers. And it would perform slightly faster. Maybe insignificantly so,
> but many small improvements add up.


Hmm. I don't think performance is an issue, in which case it could be
implemented as a macro wrapper.
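
A rough sketch of what such a macro wrapper could look like, assuming the length function is specified to accept a NULL sizeptr as discussed above. Everything here is a toy stand-in: sketch_match_data, sketch_length_bynumber() and SKETCH_SUBSTRING_ISSET() are hypothetical names, not part of the PCRE2 API, and the sketch ignores the question of slots beyond the pattern's own groups.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PCRE2_UNSET; the real value is defined in pcre2.h. */
#define SKETCH_UNSET (~(size_t)0)

/* Toy stand-in for a match data block; not the PCRE2 structure. */
typedef struct {
    const size_t *ovector;  /* pairs of start/end offsets */
    size_t oveccount;       /* number of pairs available */
} sketch_match_data;

/* Toy analogue of a length-by-number function. Returns 0 if the group
   is set, a negative value otherwise. sizeptr may be NULL, in which
   case the call is purely a "is this group set?" test. */
static int sketch_length_bynumber(const sketch_match_data *md, size_t n,
                                  size_t *sizeptr)
{
    assert(md != NULL);
    if (n >= md->oveccount) return -1;      /* no such slot */

    size_t start = md->ovector[2 * n];
    size_t end = md->ovector[2 * n + 1];
    if (start == SKETCH_UNSET) return -2;   /* group did not participate */

    if (sizeptr != NULL) *sizeptr = end - start;
    return 0;
}

/* The "cosmetic" wrapper: a discoverable name with no extra code. */
#define SKETCH_SUBSTRING_ISSET(md, n) \
    (sketch_length_bynumber((md), (n), NULL) == 0)
```

The wrapper adds a name that is easy to find without adding a second code path to maintain, which is the trade-off being weighed in this thread.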

Philip

--
Philip Hazel