Author: Philip Hazel
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1619] zero-width negative lookahead with capture not working

http://bugs.exim.org/show_bug.cgi?id=1619




--- Comment #7 from Philip Hazel <ph10@???> 2015-04-24 11:09:26 ---
On Thu, 23 Apr 2015, Zoltan Herczeg wrote:

> Interestingly, if I add a | the pattern does not match in Perl either:
>
> /(ba)a(?!(a)x|)\2(baac)/ does not match baaabaac
> /(ba)a(?!|(a)x)\2(baac)/ does not match baaabaac
>
> I suspect the first case is special-cased in Perl in some way. Or this is just
> a bug.


You have now reminded me that it was this kind of anomalous Perl
behaviour that made me decide to specify that negative lookarounds do
not set captures in PCRE.

However, I don't think your example is anomalous, because an empty
branch always matches, and so a negative lookahead containing an empty
branch must always fail:

PCRE2 version 10.20-RC1 2015-03-11
/^(?!a)c/
cc
0: c

/^(?!a|)c/
cc
No match

Perl 5.020002 Regular Expressions

/^(?!a)c/
cc
0: c

/^(?!a|)c/
cc
No match
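
For anyone who wants to reproduce the empty-branch behaviour without
pcre2test or Perl to hand, here is a minimal sketch using Python's re
module. Python's engine is neither PCRE2 nor Perl, so this only
illustrates the logic (an empty alternative always matches, hence the
negative lookahead always fails); it says nothing about how captures
inside negative lookarounds are handled in either of those engines. The
helper name first_match is made up for the illustration.

```python
import re


def first_match(pattern, subject):
    """Return the text of the first match, or None if there is none.

    Hypothetical helper for this illustration only.
    """
    m = re.search(pattern, subject)
    return m.group(0) if m else None


# Without an empty branch, the lookahead only rejects a following "a".
print(first_match(r"^(?!a)c", "cc"))    # c

# With an empty branch, the alternation inside the lookahead always
# succeeds, so the negative assertion always fails.
print(first_match(r"^(?!a|)c", "cc"))   # None

# The two patterns from the bug report contain an empty branch in the
# negative lookahead, so they cannot match any subject at all.
print(first_match(r"(ba)a(?!(a)x|)\2(baac)", "baaabaac"))  # None
print(first_match(r"(ba)a(?!|(a)x)\2(baac)", "baaabaac"))  # None
```

The position of the empty branch (first or last) makes no difference,
since the assertion fails as soon as any alternative matches.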

I misread the original pattern, thinking it was back-referencing from
outside the lookahead to a capture inside it. However, as Perl is always
changing, I am still going to play around with other examples in Perl to
see whether PCRE2 should change in this respect.

Philip

