[pcre-dev] [Bug 1883] New: Allow parsing of any context-sens…

Top Page
Delete this message
Author: admin
Date:  
To: pcre-dev
New-Topics: [pcre-dev] [Bug 1883] Allow parsing of any context-sensitive grammar by enabling non-atomic recursion, [pcre-dev] [Bug 1883] Allow parsing of any context-sensitive grammar by enabling non-atomic recursion
Subject: [pcre-dev] [Bug 1883] New: Allow parsing of any context-sensitive grammar by enabling non-atomic recursion
https://bugs.exim.org/show_bug.cgi?id=1883

            Bug ID: 1883
           Summary: Allow parsing of any context-sensitive grammar by
                    enabling non-atomic recursion
           Product: PCRE
           Version: N/A
          Hardware: All
                OS: All
            Status: NEW
          Severity: wishlist
          Priority: medium
         Component: Code
          Assignee: ph10@???
          Reporter: bobwei9@???
                CC: pcre-dev@???


Essentially: Remove this restriction from PCRE

       Recursion  processing  in PCRE differs from Perl in two important ways.
       In PCRE (like Python, but unlike Perl), a recursive subpattern call  is
       always treated as an atomic group. That is, once it has matched some of
       the subject string, it is never re-entered, even if it contains untried
       alternatives  and  there  is a subsequent matching failure.


It leads to unparseable context-sensitive grammars like:

{ S -> 0B1 | 1B0 | 1C1 | 0C0,
A -> 0 | 1,
B -> ABA | D1,
C -> ACA | D0,
D -> _E | _ ,
E -> AE | A }

This is essentially adding two bits at a same offset of two binary numbers and
checking if it matches the trailing bit. [The actual goal was matching correct
additions of two binary numbers in form of a + b = c where a, b and c each
match [01]+.]

I wished this could be expressed as:

(?(DEFINE)
(?<s> 0(?&b)1 | 1(?&b)0 | 1(?&c)1 | 0(?&c)0 )
(?<a> 0 | 1 )
(?<b> (?&a)(?&b)(?&a) | (?&d)1 )
(?<c> (?&a)(?&c)(?&a) | (?&d)0 )
(?<d> _(?&e) | _ )
(?<e> (?&a)(?&e) | (?&a) )
)
^(?&s)$

Example matches:
0_00
01_101

Example non-matches:
0_01
100_00011


In case this is declined, _why_ is PCRE exhibiting that behavior?

--
You are receiving this mail because:
You are on the CC list for the bug.