Re: [pcre-dev] Matching string length limit

Author: Philip Hazel
Date:  
To: Michael
CC: pcre-dev
Subject: Re: [pcre-dev] Matching string length limit
On Fri, 11 Jan 2008, Michael wrote:

> My English is bad, sorry. But I'll try...


Seems pretty good to me. :-)

> I plan to use PCRE for operate with very long subject strings. In 99%
> cases I know beforehand, that my matching substring length can't
> exceed N symbols (N varies). This knowledge potentially very useful
> for matching optimization. How about adding to pcre_exec and
> pcre_dfa_exec possibility to accept N parameter and backtrack if
> current matching substring yet not match at all, but it length already
> exceeds N? Such behaviour may allow to greatly optimize matching
> process.


The latest version of PCRE includes some backtracking control features
from Perl 5.10 (which has been coming "real soon now" for rather a long
time). See "Backtracking control features" in pcrepattern.3. One of
these features allows you to force a failure, though you could always do
that with (?!) of course. However, I can see that these features may not
be easy to use to do what you want. And, of course, they do not apply to
pcre_dfa_exec because that does not backtrack.
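
For readers unfamiliar with the (?!) idiom, here is a minimal sketch of how an empty negative lookahead forces a failure. It uses Python's re module purely for illustration, not PCRE itself. The Perl 5.10 style verbs described in pcrepattern.3 (such as (*FAIL)) are not available there, so only the (?!) form is shown.

```python
import re

# (?!) is an empty negative lookahead: it can never succeed, so reaching it
# makes the matcher give up on the current path and backtrack.
always_fail = re.compile(r"(?!)")
assert always_fail.search("anything") is None

# The first alternative is abandoned as soon as (?!) is reached,
# so only the second alternative ("ab") can ever produce a match.
pat = re.compile(r"a(?:x(?!)|b)")
m = pat.search("ax ab")
print(m.group(), m.span())
```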

Without looking at the code, my guess is that this would involve quite a
lot of work to add to pcre_exec, because the current point is adjusted
in many places, but not much work to add to pcre_dfa_exec, where the
current point is moved in only one place. Of course, a different calling
interface would be needed, in order to retain backwards compatibility,
and that would complicate things.
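
A rough sketch of how such a limit might be approximated from the calling side, with no change to the library: try an anchored match at each start position and hide everything beyond start + N from the matcher. The sketch uses Python's re module for illustration. The helper name search_bounded and the test data are hypothetical. With pcre_exec, the same idea would presumably mean passing PCRE_ANCHORED, a start offset, and a shortened subject length, but that variant is untested here.

```python
import re

def search_bounded(pattern, subject, max_len):
    """Return the first match no longer than max_len characters, or None.

    Hypothetical helper: tries each start position in turn and hides
    everything past start + max_len from the matcher via endpos.
    """
    for start in range(len(subject) + 1):
        end = min(start + max_len, len(subject))
        # match() is anchored at `start`; endpos makes the matcher behave
        # as if the subject ended at `end`.
        m = pattern.match(subject, start, end)
        if m is not None:
            return m
    return None

pat = re.compile(r"<.*?>")
subject = "<" + "x" * 50 + " <b> tail"

unbounded = pat.search(subject)            # runs from the first "<" to the first ">"
bounded = search_bounded(pat, subject, 5)  # only accepts matches of at most 5 characters
print(unbounded.span(), bounded.span(), bounded.group())
```

One caveat with this approach: assertions such as $, \b and lookaheads treat the end of the window as the end of the subject, so patterns that depend on what follows the match may behave differently than they would with a true length limit inside the matcher.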

I have noted the request, but I do not promise that I will actually do
anything about it.

Philip

--
Philip Hazel