Re: [pcre-dev] Partial match at end of subject

Author: ph10
Date:  
To: ND
CC: Pcre-dev
Subject: Re: [pcre-dev] Partial match at end of subject
On Sat, 13 Jul 2019, ND via Pcre-dev wrote:

> At its core \z is positive lookahead assertion that want to inspect next
> character of subject.


I must admit I had not thought of it like that. I considered it just to
be "are we at the end of the subject?".

> I propose following algorithm (for PARTIAL_HARD only disregarding the existence
> of PARTIAL_SOFT):
>
> . Are we at the end of the subject? If no, backtrack
> . Is partial hard matching allowed? If no, continue matching
> . Have we inspected any characters? If yes, return a partial match Else
> return "no match"
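
For reference, the three steps quoted above can be written down as a
small decision function. This is only a sketch in C, not code from
PCRE2: the enum, the function name, and the "inspected_any" flag are all
hypothetical, and how a matcher would actually track whether any
characters were inspected is left as an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome codes for this sketch; not PCRE2's own constants. */
typedef enum {
    EOS_BACKTRACK,  /* not at the end of the subject: \z fails here */
    EOS_CONTINUE,   /* at the end, no hard partial: carry on matching */
    EOS_PARTIAL,    /* at the end, hard partial, characters inspected */
    EOS_NOMATCH     /* at the end, hard partial, nothing inspected */
} eos_outcome;

/* Applies the three quoted steps in order.
   pos           - current offset in the subject
   subject_len   - length of the subject
   partial_hard  - true if hard partial matching was requested
   inspected_any - assumed flag: has this attempt inspected any character? */
static eos_outcome check_end_of_subject(size_t pos, size_t subject_len,
                                        bool partial_hard, bool inspected_any)
{
    if (pos < subject_len)
        return EOS_BACKTRACK;
    if (!partial_hard)
        return EOS_CONTINUE;
    return inspected_any ? EOS_PARTIAL : EOS_NOMATCH;
}
```

With hard partial matching requested and nothing inspected, a call at
the end of the subject gives EOS_NOMATCH, which is the case exercised by
/\z/ against abc.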


I have been experimenting with this. It "fixes" your first example:

/\z/
abc\=ph
No match

Your third example is not a partial matching situation:

/c*/aftertext
ab\=ph
0:
0+ ab

This has found a complete match right at the start of the subject. It
has not hit the end of the subject. However, starting the match at the
end of the subject gives a different result:

/c*/aftertext
ab\=ph,offset=2
No match

Before the change, this would have given a complete match.

Your second example still gives a full match.

/(?!\C)/aftertext
ab\=ph
0:
0+

The reason is that the end-of-subject test happens inside the negative
assertion, where "no match" means "assertion is true".
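
The inversion can be illustrated with a toy model in C. This is a
sketch only: the function names are hypothetical, it models nothing but
the pattern (?!\C), and it ignores all of the partial-matching
bookkeeping that the real matcher does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Models \C inside the assertion: succeeds only if a code unit remains. */
static bool match_any_unit(size_t pos, size_t subject_len)
{
    return pos < subject_len;
}

/* Models (?!\C): a negative lookahead succeeds exactly when its body
   fails, so "no match" inside becomes "assertion is true" outside. */
static bool negative_lookahead_any_unit(size_t pos, size_t subject_len)
{
    return !match_any_unit(pos, subject_len);
}

/* Tries each start offset in turn, as an unanchored match would, and
   returns the first offset at which the assertion succeeds. */
static size_t first_match_offset(size_t subject_len)
{
    for (size_t pos = 0; pos <= subject_len; pos++) {
        if (negative_lookahead_any_unit(pos, subject_len))
            return pos;
    }
    return subject_len; /* not reached: the final offset always succeeds */
}
```

For a two-character subject such as ab, the first offset at which this
model succeeds is 2, which agrees with the empty match and empty
aftertext shown above.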

I am still not entirely convinced this change should be made. Zoltán,
what do you think? It would involve making changes to JIT, of course.

Philip

--
Philip Hazel