[pcre-dev] [Bug 1251] Fails when recursive group comes befor…

Author: Philip Hazel
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1251] Fails when recursive group comes before reference
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=1251

Philip Hazel <ph10@???> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID





--- Comment #3 from Philip Hazel <ph10@???> 2012-05-26 18:06:06 ---
This is not a bug, but an example of how PCRE differs from Perl, which does
match. This is an extract from the pcrepattern man page:

----------------------------------------------------------------------
Recursion processing in PCRE differs from Perl in two important ways. In PCRE
(like Python, but unlike Perl), a recursive subpattern call is always treated
as an atomic group. That is, once it has matched some of the subject string, it
is never re-entered, even if it contains untried alternatives and there is a
subsequent matching failure.
----------------------------------------------------------------------

In your pattern, after it has matched '::', the fragment (?>(?2):)? is called.
Group 2 is ((?1)(?>:(?1)){0,4}) and group 1 matches a sequence of hex digits.
Thus, group 2 matches '0', then ':255' (that is, it matches the inner
subpattern once for the {0,4} repeat). However, as the next character is not a
colon, the whole of (?>(?2):) then fails to match. In Perl, the recursion into
group 2 can be re-entered so that it matches (?>:(?1)) 0 times instead of 1, but
in PCRE this does not happen, and so the whole match fails.
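
The difference can be sketched outside PCRE. The following Python is only an
illustration under stated assumptions, not the reporter's pattern: HEX is an
assumed stand-in for group 1, the subject "0:255." is a hypothetical fragment
chosen to reproduce the situation described above, and the inner (?>:(?1)) is
simplified to a plain non-capturing group, which does not change the outcome
here. The recursive call is inlined, and the atomic treatment is emulated with
the (?=(X))\1 idiom, since a lookahead is never re-entered once it has matched.

```python
import re

# Assumed stand-in for group 1 ("a sequence of hex digits").
HEX = r"[0-9A-Fa-f]+"
# Group 2 with the (?1) calls inlined: hex, then up to four ":hex" items.
GROUP2 = rf"{HEX}(?::{HEX}){{0,4}}"

# Perl-like: the body of group 2 is ordinary pattern text, so the matcher
# may backtrack into it and retry with fewer repeats.
backtracking = re.compile(rf"(?:{GROUP2}):")

# PCRE-like: the call to group 2 behaves as an atomic group. The lookahead
# captures the greedy match once, and \1 consumes exactly that text.
atomic = re.compile(rf"(?=({GROUP2}))\1:")

subject = "0:255."  # hypothetical subject fragment

perl_like = backtracking.match(subject)
pcre_like = atomic.match(subject)

print(perl_like.group(0))  # 0:   (group 2 re-entered, repeat matched 0 times)
print(pcre_like)           # None (group 2 kept "0:255", then ':' failed on '.')
```

With a subject such as "0:255:" both variants match, because the greedy match
of group 2 is then followed by a colon and no backtracking is needed.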

Changing the final ':' to '!' alters the entire behaviour, because group 2 then
matches (?>:(?1)) zero times right away.


--
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email