Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1883] Allow parsing of any context-sensitive grammar by enabling non-atomic recursion
https://bugs.exim.org/show_bug.cgi?id=1883

--- Comment #1 from Philip Hazel <ph10@???> ---
PCRE had recursion quite some time before Perl did, but it was an addition to
the original design. It was implemented as an atomic call because that fitted
in with the way the code worked. Supporting non-atomic recursion would require
a complete re-design. Roughly, the explanation is that PCRE does not have the
ability to remember backup points inside a group that is active multiple times
simultaneously (because it's been called recursively). The same issue applies
to repeated groups. PCRE solves this by duplication: a group such as (A){2} is
compiled as (A)(A) and if A is complex, remembered backup points in the
individual groups represent backups in specific iterations. This does lead to
large memory use when someone writes (A){100000}, for example.
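To make the atomic/non-atomic distinction concrete, here is a toy model in Python. It is not PCRE code and says nothing about PCRE's internals. It hand-codes a matcher for the palindrome-style pattern ^((.)(?1)\2|.?)$, and the names group_ends and matches are made up for this illustration. With atomic=True the recursive call keeps only its first successful match and discards its remaining backup points. With atomic=False the caller can go back into the recursion and ask for the next possibility.

```python
from itertools import islice


def group_ends(subject, start, atomic):
    """Yield, in backtracking order, each position where group 1 can end.

    Group 1 is modelled as: (.)(?1)\\2 | .?
    """
    # Alternative 1: one character, a recursive call, then the same character.
    if start < len(subject):
        first = subject[start]
        inner = group_ends(subject, start + 1, atomic)
        if atomic:
            # Atomic call: keep only the first way the recursion matched and
            # forget every backup point inside it.
            inner = islice(inner, 1)
        for end in inner:
            if end < len(subject) and subject[end] == first:
                yield end + 1
    # Alternative 2: .? (greedy, so try one character before none).
    if start < len(subject):
        yield start + 1
    yield start


def matches(subject, atomic):
    """Anchored match: the outermost group must end at the end of the subject."""
    return any(end == len(subject) for end in group_ends(subject, 0, atomic))


print(matches("abba", atomic=False))
print(matches("abba", atomic=True))
```

In this model "abba" matches only when recursion is non-atomic. The atomic version lets an inner call settle on a single character, and it cannot be re-entered later to match the empty string instead.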

The original plan (remember, this was in 1997, when Perl 4 was current and
patterns were a lot simpler) was to avoid the use of malloc() in the matching
function (pcre_exec), for better performance. In some circumstances this
constraint has been broken, but in many cases it still applies. To implement
non-atomic recursions (and eliminate the duplication of iterated groups) some
new algorithm that involved some kind of explicit stack would be required. If I
were starting from scratch, that is perhaps what I might do, but I do not think
it is likely that I will do anything now[*], though this issue does, of course,
remain on the wish list.

[*] and definitely not in the 8.xx series, which is in "maintenance only"
state.
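As a rough illustration of what "some kind of explicit stack" could look like, the Python sketch below matches a fixed-count repeat such as (a|ab){2} without duplicating the group. This is an assumption about one possible shape, not a description of any PCRE design. The function name match_repeat and the restriction to literal alternatives are invented for the example. Each stack entry is a backup point recording the subject position, the iteration number and the next alternative to try, so backups in different iterations of the same group stay distinct.

```python
def match_repeat(alternatives, count, subject):
    """Anchored match of (alt1|alt2|...){count} against the whole subject.

    Uses an explicit stack of backup points instead of copying the group
    once per iteration.
    """
    # Each entry: (position in subject, iteration number, alternative index).
    stack = [(0, 0, 0)]
    while stack:
        pos, iteration, alt = stack.pop()
        if iteration == count:
            if pos == len(subject):
                return True
            continue  # All iterations done but subject not consumed: back up.
        if alt >= len(alternatives):
            continue  # No alternatives left for this iteration: back up.
        # Remember a backup point: the next alternative in this same iteration.
        stack.append((pos, iteration, alt + 1))
        literal = alternatives[alt]
        if subject.startswith(literal, pos):
            # Move on to the next iteration, starting with its first alternative.
            stack.append((pos + len(literal), iteration + 1, 0))
    return False


print(match_repeat(["a", "ab"], 2, "aba"))
```

The compiled description of the group, here the alternatives list, stays the same size whatever the count is. Only the stack grows, and only as far as the match actually explores.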
