Remember that pcre_exec() returns whichever of a full match or a hard
partial match it finds first.
On Sun, 22 Jan 2012, ND wrote:
> PCRE version 8.21 2011-12-12
> /(?<=a)(?!b)/+
> \P\Pa
> Partial match: a
The lookbehind works; in the lookahead a partial match is forced because
the next character is not available *and* at least one character has
been inspected.
> Now we swap assertions:
>
> PCRE version 8.21 2011-12-12
> /(?!b)(?<=a)/+
> \P\Pa
> 0:
> 0+
In this case a partial match is not forced in the lookahead because no
characters have been inspected. So the lookahead succeeds and the rest
of the pattern matches.
> Another example:
>
> PCRE version 8.21 2011-12-12
> /(?!a)/+
> \P\Pa
> 0:
> 0+
Same thing. A partial match can never be an empty string.
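To make the rule in the three examples concrete, here is a rough sketch of it as a toy model in Python. It is not taken from the PCRE source: the names (Assertion, hard_partial_exec, start_used) are invented for illustration, and it assumes a pattern made only of single-character lookbehind and lookahead assertions, matched with hard partial matching turned on.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Assertion:
    """One single-character assertion (hypothetical helper type)."""
    kind: str        # "behind" for (?<=x)/(?<!x), "ahead" for (?=x)/(?!x)
    char: str        # the literal character inside the assertion
    negative: bool   # True for the negated forms (?<!x) and (?!x)


def hard_partial_exec(assertions: List[Assertion],
                      subject: str) -> Tuple[str, Optional[object]]:
    """Toy model of hard partial matching for assertion-only patterns.

    Returns ("match", offset), ("partial", text) or ("no match", None).
    """
    end = len(subject)
    for start in range(end + 1):
        # Earliest offset inspected during this attempt. A lookbehind
        # can move it to the left of the starting offset.
        start_used = start
        ok = True
        for a in assertions:
            if a.kind == "behind":
                if start == 0:
                    inner = False          # nothing to look back at
                else:
                    start_used = min(start_used, start - 1)
                    inner = subject[start - 1] == a.char
            else:
                if start >= end:
                    # The next character is not available. A partial
                    # match is forced only if something was inspected,
                    # because a partial match cannot be empty.
                    if start > start_used:
                        return ("partial", subject[start_used:end])
                    inner = False
                else:
                    inner = subject[start] == a.char
            if inner == a.negative:        # the assertion failed
                ok = False
                break
        if ok:
            return ("match", start)        # empty match at this offset
    return ("no match", None)


behind_a = Assertion("behind", "a", False)   # (?<=a)
not_b = Assertion("ahead", "b", True)        # (?!b)
not_a = Assertion("ahead", "a", True)        # (?!a)

print(hard_partial_exec([behind_a, not_b], "a"))   # ('partial', 'a')
print(hard_partial_exec([not_b, behind_a], "a"))   # ('match', 1)
print(hard_partial_exec([not_a], "a"))             # ('match', 1)
```

The only thing that differs between the first two calls is whether the lookbehind has already inspected the "a" by the time the lookahead runs off the end of the subject.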
On Sun, 22 Jan 2012, Zoltán Herczeg wrote:
> But you want a complete behaviour change for your specific use case
> which would break compatibility, although they have some reasons.
Indeed.
> As this would be a compatibility breaker new feature, we should
> probably aim for 8.31. The best thing would be to open a bug and
> discuss this new behaviour. And let other people tell their opinion
> which usually takes time...
>
> To summarize what you have said so far, if hard partial matching is enabled:
> - \z (and perhaps \Z and $) must never match at the end of the string
> - A match must not be allowed at the end of the subject string
If a new feature is added (or the behaviour of hard partial is changed),
the only possibility would be to return no match.
A thought: we already have the PCRE_NOTEOL flag, but it is documented
like this:
  PCRE_NOTEOL

  This option specifies that the end of the subject string is not the
  end of a line, so the dollar metacharacter should not match it nor
  (except in multiline mode) a newline immediately before it. Setting
  this without PCRE_MULTILINE (at compile time) causes dollar never to
  match. This option affects only the behaviour of the dollar
  metacharacter. It does not affect \Z or \z.
I wonder why it does not affect \Z or \z? I further wonder if it should
be made to affect \Z and \z when hard partial matching is happening?
Philip
--
Philip Hazel