Re: [pcre-dev] match point reset bug?

Author: Philip Hazel
Date:  
To: Craig Silverstein
CC: pcre-dev
Subject: Re: [pcre-dev] match point reset bug?
On Fri, 23 Oct 2009, Craig Silverstein wrote:

> OK, I had a free moment to look into this, and the comment in
> pcredemo.c is pretty clear (and the following code is too). I don't
> think this will be too hard to add to the C++ wrapper, though you
> never know.


Good!

> I'll use shari's example string to test, involving
> GlobalReplace("abc\K|def\K") on "abcdefghi".  But what is the exact
> test to do?  Also, is there another good string for testing, that does
> not involve \K?  Maybe something like
>    GlobalReplace("aa|b*", "!", "aaaaa")
> ?
>
> What is the proper output in that case?


That's a good test. Using pcretest (which also implements the magic
algorithm) I get:

$ ./pcretest
PCRE version 8.00 2009-10-19

re> /aa|b*/g+
data> aaaaa

0: aa
0+ aaa
0: aa
0+ a
0:
0+ a
0:
0+

This shows that it matches the first two characters "aa" (with three to
follow; that's what the "0+ aaa" line means), then the next two
characters "aa". At the final "a" it cannot match "aa", so "b*" matches
an empty string (with just "a" left over); after that it moves on and
matches an empty string at the end. So if you did a global replace, the
output should be "!!!a!", as indeed Perl confirms:

$ perl -e '$x = "aaaaa"; $x =~ s/aa|b*/!/g; print "$x\n";'
!!!a!
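
For anyone who wants to check the sequence of matches without pcretest,
here is a rough sketch in Python. It is only an illustration: Python's
re module is not PCRE, and I am assuming that for this particular
pattern it visits the same match positions (sub() only does so from
Python 3.7 on). The helper names trace_global_matches and
global_replace are made up for this example and are not part of PCRE or
the C++ wrapper. The replacement is inserted literally, with no
backreference expansion.

```python
import re


def trace_global_matches(pattern, subject):
    """Return (matched_text, remaining_text) pairs for every global match.

    This mimics the "0:" and "0+" lines that pcretest prints with /g+.
    """
    return [(m.group(0), subject[m.end():])
            for m in re.finditer(pattern, subject)]


def global_replace(pattern, replacement, subject):
    """Replace every match, including empty ones, with a literal string."""
    pieces = []
    last = 0  # end offset of the previous match
    for m in re.finditer(pattern, subject):
        pieces.append(subject[last:m.start()])  # unmatched text before the match
        pieces.append(replacement)
        last = m.end()
    pieces.append(subject[last:])  # unmatched tail, if any
    return "".join(pieces)


if __name__ == "__main__":
    for matched, rest in trace_global_matches(r"aa|b*", "aaaaa"):
        print("0: " + matched)
        print("0+ " + rest)
    print(global_replace(r"aa|b*", "!", "aaaaa"))
```

If the assumption holds, the trace should list the same four matches as
the pcretest run above, and the last line should agree with the Perl
result.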

Philip

--
Philip Hazel