Re: [pcre-dev] Some words about assertion docs

Author: ph10
Date:  
To: Pcre-dev
CC: ND
Subject: Re: [pcre-dev] Some words about assertion docs
On Tue, 9 Jul 2019, I wrote:

> I have put this on the wish list, but until I look at the code, I have
> no idea whether it will be easy or straightforward to implement in the
> interpreter. I will try to investigate soon. If it turns out to be
> possible in the interpreter, it will be up to Zoltan to decide whether to
> add it to JIT.


It turned out to be very easy to implement in the interpreter, but there
was quite a lot of necessary but straightforward work to add new opcodes
and process them in the various scans of compiled patterns. Also, the
documentation took some time.

I have done this work and committed the patches. The new code supports
both (*napla: (non-atomic positive lookahead) and (*naplb: (non-atomic
positive lookbehind). Although I haven't been able to think of a good
example for the latter, here is a test example:

/(*naplb:(.)..|(.)...)(\1|\2)/
    abcdb
 0: b
 1: b
 2: <unset>
 3: b
    abcda
 0: a
 1: <unset>
 2: a
 3: a


For a non-atomic lookaround to be useful, it must change the captured
groups after a backtrack, and the rest of the pattern must refer to a
changed group. This means there is no point in even looking at the
pcre2_dfa_match() code, because DFA matching does not support capturing.
At the moment, there is no JIT support for these new lookarounds; JIT
compilation will fail.
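
To make the backtracking behaviour concrete, here is a rough Python
sketch that hand-simulates the test pattern above. It is not PCRE2 code
and does not use any PCRE2 API; the function names and the "atomic" flag
are made up purely for illustration. It assumes the default PCRE2
behaviour that a back reference to an unset group fails to match.

```python
# Hand simulation of /(*naplb:(.)..|(.)...)(\1|\2)/ (illustrative only).

def lookbehind_alternatives(subject, pos):
    """Yield (group1, group2) for each way the lookbehind can succeed at pos."""
    if pos >= 3:
        yield subject[pos - 3], None   # alternative (.)..  sets group 1
    if pos >= 4:
        yield None, subject[pos - 4]   # alternative (.)... sets group 2


def simulate(subject, atomic):
    """Return a dict of captured groups, or None if there is no match.

    With atomic=True the lookbehind commits to its first successful
    alternative, as an ordinary (?<=...) would. With atomic=False the
    matcher may backtrack into the lookbehind and try the next one.
    """
    for pos in range(len(subject) + 1):
        for g1, g2 in lookbehind_alternatives(subject, pos):
            for ref in (g1, g2):       # (\1|\2); an unset reference fails
                if ref is not None and subject.startswith(ref, pos):
                    return {0: ref, 1: g1, 2: g2, 3: ref}
            if atomic:
                break                  # no backtracking into the assertion
    return None


print(simulate("abcdb", atomic=False))
print(simulate("abcda", atomic=False))
print(simulate("abcda", atomic=True))
```

In this sketch "abcda" matches only in the non-atomic case: the first
alternative sets group 1 to "b", the rest of the pattern fails, and the
match is found after backtracking into the second alternative, which
sets group 2 to "a" and leaves group 1 unset.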

It is quite possible that I have missed one or more places in the code
that need updating for the new opcodes. Please test this if you can.

Philip

--
Philip Hazel