[pcre-dev] [Bug 883] Crash with sample PHP-Script

Author: Philip Hazel
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 883] Crash with sample PHP-Script
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=883

Philip Hazel <ph10@???> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID





--- Comment #2 from Philip Hazel <ph10@???> 2009-09-16 18:02:18 ---
I have extracted your pattern and string and tried them directly on PCRE using
the pcretest program. With PCRE 7.9 it does indeed crash, as you report.
However, when the stack size is increased, it works fine, so this is just a
typical "pattern that eats stack" problem. Interestingly, with the forthcoming
8.00 release it does match with the default stack size (on Linux), but I guess
that is probably marginal.

Your pattern is typical of those that not only eat stack but are also very
inefficient. Matching a single character inside parens and then repeating the
parens is not good news, because each character costs a full iteration of the
group. It would work better and faster if you added a repeat to the [] inside
the parens. Even better would be a possessive quantifier, which makes the
non-matching case more efficient because the engine does not backtrack into the
repeated characters. There is discussion of this kind of thing in Friedl's
book. Adding ++ after the final ] inside the parentheses, giving

/^(?:%[[:xdigit:]]{2}|[A-Za-z0-9-_.!~*'()\[\];\/?:@&=+$,]++)*$/

for the final pattern, when run on my box and timed using pcretest's rather
crude timing estimates, reduced the matching time from 5.83 milliseconds to
0.03 milliseconds.
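
For anyone who wants to try the two forms side by side, here is a rough sketch
in Python rather than PHP or pcretest. Several things in it are assumptions,
not taken from the bug report: the "per_char" pattern is only my guess at the
shape of the original (the same pattern without the ++), [0-9A-Fa-f] stands in
for [[:xdigit:]] because Python's re module has no POSIX classes, and the
sample strings are made up. Python's re accepts ++ only from version 3.11, so
the sketch falls back to None on older versions. It checks only that both
forms accept and reject the same strings; it does not reproduce the stack
behaviour or the timings quoted above.

```python
import re

# Character class from the pattern above; the hyphen is escaped for clarity.
CLASS = r"[A-Za-z0-9\-_.!~*'()\[\];/?:@&=+$,]"
# Stand-in for PCRE's %[[:xdigit:]]{2} (Python's re has no POSIX classes).
HEX = r"%[0-9A-Fa-f]{2}"

# Assumed shape of the original pattern: one character per group iteration.
per_char = re.compile(rf"^(?:{HEX}|{CLASS})*$")

# Possessive form from the message. "++" needs Python 3.11 or later.
try:
    possessive = re.compile(rf"^(?:{HEX}|{CLASS}++)*$")
except re.error:
    possessive = None  # older Python: possessive quantifiers unsupported


def matches(pattern, text):
    """Return True if the whole of text is accepted by pattern."""
    return pattern.match(text) is not None


if __name__ == "__main__":
    # Hypothetical sample inputs, not the string from the bug report.
    samples = ["abc%20def", "a/b?c=d&e=f", "", "abc def", "%zz", "%2"]
    for s in samples:
        row = [matches(per_char, s)]
        if possessive is not None:
            row.append(matches(possessive, s))
        print(repr(s), row)
```

The two compiled patterns should agree on every input; the point of the
possessive form is only that a failing match is abandoned sooner.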


--
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email