[pcre-dev] [Bug 947] Very weird bug.

Author: Philip Hazel
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 947] Very weird bug.
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=947




--- Comment #1 from Philip Hazel <ph10@???> 2010-01-09 09:45:32 ---
Perl, both in versions 5.8 and 5.10, behaves in the same way as PCRE, and,
remember, the "PC" in PCRE is "Perl-compatible". I think I see what is
happening with your regex ^(xa|=?\1a){2}$ when matched against xa=xaaa. The
first time it tries to match the parentheses, it matches xa. The second time,
it does not find xa, so it tries for =?\1a. It matches the equals, then it
tries to match "what parentheses #1 matched", which is xa, so it does. Then it
finds the next a. However, it is NOT then at the end of the subject, so it
backtracks. The backtrack point is the optional =, so it tries to match without
matching the =. At this point it is again looking for "what parentheses #1
matched", and I guess what is happening is that that value is now set to "=xaa"
(what the parentheses matched on their second repetition, before the backtrack)
instead of being reset to "xa". So it matches "=xaa", and then finds the final
a. Removing the ? of course changes this.
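
To make the suspected sequence concrete, here is a minimal sketch in Python. It
is a hand-written backtracking matcher for this one pattern only, not PCRE or
Perl code. The function name and the restore_on_backtrack flag are invented for
illustration. The only assumption it encodes is the guess above: that the
captured value is left in place when the matcher backtracks out of the group.

```python
def matches(subject: str, restore_on_backtrack: bool) -> bool:
    # Hand-coded backtracking matcher for the single pattern ^(xa|=?\1a){2}$.
    # Hypothetical model for illustration; this is not PCRE or Perl code.
    capture = {"g1": None}  # text most recently captured by parentheses #1

    def group(pos: int, remaining: int) -> bool:
        if remaining == 0:
            return pos == len(subject)  # the $ anchor
        saved = capture["g1"]  # value to re-instate if this attempt fails

        # Alternative 1: the literal "xa".
        if subject.startswith("xa", pos):
            capture["g1"] = subject[pos:pos + 2]
            if group(pos + 2, remaining - 1):
                return True
            if restore_on_backtrack:
                capture["g1"] = saved

        # Alternative 2: =?\1a. The greedy "=?" is tried with the "=" first,
        # then without it (the backtrack point in the walk-through).
        starts = [pos + 1, pos] if subject.startswith("=", pos) else [pos]
        for start in starts:
            ref = capture["g1"]
            if ref is None or not subject.startswith(ref, start):
                continue  # an unset or non-matching \1 fails
            end = start + len(ref)
            if not subject.startswith("a", end):
                continue
            capture["g1"] = subject[pos:end + 1]
            if group(end + 1, remaining - 1):
                return True
            if restore_on_backtrack:
                capture["g1"] = saved
        return False

    return group(0, 2)


print(matches("xa=xaaa", restore_on_backtrack=False))  # True: stale capture kept
print(matches("xa=xaaa", restore_on_backtrack=True))   # False: capture re-instated
```

With the stale value kept, the second attempt sees \1 as "=xaa" and the whole
subject matches. With the old value "xa" re-instated on backtrack, \1 cannot
match at the "=" and the match fails, which is the expected result.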

Although PCRE is compatible with Perl, I agree that this is probably a bug. It
is certainly unexpected behaviour. I am not sure how easy it is to fix in PCRE,
however, because I am not sure about any mechanism for saving and re-instating
parenthesis contents at backtracking points. I will have to study the code and
think about it. A Perl bug should also be reported.

