Author: Philip Hazel
Date:
To: Michael
CC: pcre-dev
Subject: Re: [pcre-dev] Matching string length limit

On Fri, 11 Jan 2008, Michael wrote:
> My English is bad, sorry. But I'll try...
Seems pretty good to me. :-)
> I plan to use PCRE for operate with very long subject strings. In 99%
> cases I know beforehand, that my matching substring length can't
> exceed N symbols (N varies). This knowledge potentially very useful
> for matching optimization. How about adding to pcre_exec and
> pcre_dfa_exec possibility to accept N parameter and backtrack if
> current matching substring yet not match at all, but it length already
> exceeds N? Such behaviour may allow to greatly optimize matching
> process.
The latest version of PCRE includes some backtracking control features
from Perl 5.10 (which has been coming "real soon now" for rather a long
time). See "Backtracking control features" in pcrepattern.3. One of
these features allows you to force a failure, though you could always do
that with (?!) of course. However, I can see that these features may not
be easy to use to do what you want. And, of course, they do not apply to
pcre_dfa_exec because that does not backtrack.
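
As a small illustration of forcing a failure with (?!): an empty negative
lookahead can never succeed, so a backtracking matcher that reaches it has
to give up on that path and try another alternative. The sketch below uses
Python's re module rather than PCRE (an assumption made only so that the
example is self-contained); the (?!) construct is written the same way in
both. The helper name first_match_span is hypothetical.

```python
import re

# (?!) is an empty negative lookahead: it fails wherever it is reached.
never = re.compile(r"(?!)")

# The first alternative matches "a", then hits (?!) and is forced to fail;
# the engine backtracks and tries the second alternative instead.
forced = re.compile(r"a(?!)|b")


def first_match_span(pattern, subject):
    """Return the (start, end) span of the first match, or None."""
    m = pattern.search(subject)
    return m.span() if m else None


if __name__ == "__main__":
    print(first_match_span(never, "abc"))
    print(first_match_span(forced, "ab"))
```

A forced failure on its own does not know how long the current match
attempt has become, which is one reason it may be awkward to use for a
length limit.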
Without looking at the code, my guess is that this would involve quite a
lot of work to add to pcre_exec, because the current point is adjusted
in many places, but not much work to add to pcre_dfa_exec, where the
current point is moved in only one place. Of course, a different calling
interface would be needed, in order to retain backwards compatibility,
and that would complicate things.
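
Until something like this exists in the library, a caller can approximate
the requested behaviour from outside by giving the matcher only a window of
at most N characters for each starting position. The following is a rough
sketch of that idea, not a description of anything in PCRE: it again uses
Python's re module for self-containment, and search_bounded and max_len are
hypothetical names.

```python
import re


def search_bounded(pattern, subject, max_len):
    """Hypothetical helper: return the first match that fits within
    max_len characters of its starting position, or None."""
    for start in range(len(subject) + 1):
        end = min(start + max_len, len(subject))
        # match() is anchored at `start`; passing `end` as endpos makes the
        # engine treat that position as the end of the subject, so no match
        # attempt can run past the window.
        m = pattern.match(subject, start, end)
        if m is not None:
            return m
    return None


if __name__ == "__main__":
    subject = "a" + "x" * 10 + "b ab"
    pattern = re.compile(r"a.*b")
    print(pattern.search(subject).span())
    print(search_bounded(pattern, subject, 5).span())
```

This is not equivalent to a limit built into the matcher: assertions such
as $ and lookaheads see the end of the window as the end of the subject,
and a separate match attempt is made at every starting position.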
I have noted the request, but I do not promise that I will actually do
anything about it.