Re: [pcre-dev] several messages

Author: ph10
Date:  
To: ND
CC: Pcre-dev
Subject: Re: [pcre-dev] several messages
On Sat, 22 Jun 2019, ND via Pcre-dev wrote:

> A successful match of "X*\z" means that PCRE says: X CAN be successfully
> repeated until the very end of subject (let's say the match is "abc", for
> example). When we use "X*" we want to say: repeat X as much as it can.


Yes, but there is special treatment when X is a group that can match an
empty string. Clearly, repeatedly matching an empty string can never
reach the end of the subject.

PCRE and Perl choose to stop repeating the group when this happens. Your
example is

/(?:a|(?=b)|.)*\z/

I suppose an alternative choice for dealing with matching an empty
string would be to move on to the next alternative branch in the group
(that is, ".") when there is another branch to try, but that is not what
Perl chose to do.
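
To see the effect concretely, here is a minimal sketch using Python's re
module rather than PCRE itself. That is an assumption on two counts: Python
uses a different engine that merely appears to apply a similar "stop
repeating after an empty iteration" rule, and it spells the end-of-subject
anchor \Z where PCRE has \z. The helper name matched_text is made up for
this illustration.

```python
import re

# The group from the example; the middle branch (?=b) can match an
# empty string, which is what triggers the special treatment.
GROUP = r"(?:a|(?=b)|.)*"


def matched_text(pattern, subject):
    """Return the text matched at the start of subject, or None."""
    m = re.match(pattern, subject)
    return m.group(0) if m else None


# Without the anchor: after "a", the lookahead branch matches empty at
# "b", so the engine stops repeating and accepts what it has.
without_anchor = matched_text(GROUP, "abc")

# With the anchor: stopping after "a" fails to reach the end, so the
# engine backtracks into the group and tries the "." branch instead.
with_anchor = matched_text(GROUP + r"\Z", "abc")

print(repr(without_anchor), repr(with_anchor))
```

For PCRE, the results discussed in this thread are "a" without the anchor
and "abc" with it.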

> From the previous sentence we know that X CAN be repeated until the very
> end of subject. So we expect matching "abc" too. But now PCRE says
> that match is "a" (in other words PCRE says: "X CAN'T be repeated
> until the very end of subject"). This contradicts our
> expectations.


I understand what you are saying, but there is nothing I can do about
it. There must be plenty of examples where removing \z changes what is
matched. How about /[ab]*\z/ matched against "aaaxxxbbb"?

I'm afraid you will just have to live with this.

Philip

--
Philip Hazel