I have a string that's 150,002 characters consisting of
P1P2......P4....P9.....
(Thats 'P1', followed by 50k each of 'P2', 'P4', 'P9')
and a simpleminded regex of
(P1(P2)*(P4)*(P9)*)?
With PCRE 7.9, this crashes in match() on the 5672'd recursive call (on a linux box with a moderately large stack)
With the old Spencer regex, we match successfully.
I tried the obvious things to improve this:
1/ replaced capturing parentheses with (?: )
2/ Added explicit ^$ anchors
3/ possessive quantifiers.
4/ atomic grouping
In all cases I still stomp the stack via recursion in match().
I assume that if I configured PCRE to not use the stack that I could make this run (although probably really slowly due to malloc/free overhead).
Any advice?