[pcre-dev] [Bug 1562] intermittent segfault using grep -P

Author: Philip Hazel
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1562] intermittent segfault using grep -P

http://bugs.exim.org/show_bug.cgi?id=1562

Philip Hazel <ph10@???> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

--- Comment #2 from Philip Hazel <ph10@???> 2014-12-31 11:50:04 ---
Graycode is absolutely correct. This is a "well-known" effect of the way PCRE
remembers backtracking points -- and your pattern has a backtracking point for
every character in the subject. If you increase the size of the system stack,
you will find that it can handle longer strings. Alternatively, you can compile
PCRE to use the heap instead of the stack, but this makes it run more slowly.
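
To illustrate why the limit scales with subject length, here is a minimal
sketch in Python of a toy recursive matcher. It is not PCRE's code; the
function name match_repeat and its parameters are hypothetical, chosen only for
this example. It matches a pattern of the form (alt1|alt2|...)* against the
whole subject and makes one nested call per matched alternative, which stands
in for the backtracking point that has to be remembered.

```python
def match_repeat(subject, alternatives, pos=0, depth=1, stats=None):
    """Toy matcher for (alt1|alt2|...)* anchored at the end of the subject.

    Hypothetical illustration only, not PCRE's implementation. Every
    alternative that matches leads to one nested call, so the call depth
    grows with the number of matched items. Alternatives must be
    non-empty strings, otherwise the recursion would never advance.
    """
    if stats is not None:
        # Record the deepest level of nesting reached so far.
        stats["max_depth"] = max(stats["max_depth"], depth)
    if pos == len(subject):
        return True
    for alt in alternatives:
        if subject.startswith(alt, pos):
            # The pending loop over the remaining alternatives is the
            # "backtracking point" that stays on the stack.
            if match_repeat(subject, alternatives, pos + len(alt),
                            depth + 1, stats):
                return True
    return False


if __name__ == "__main__":
    stats = {"max_depth": 0}
    ok = match_repeat("ab" * 50, ["a", "b"], stats=stats)
    print(ok, stats["max_depth"])
```

With single-character alternatives, a 100-character subject reaches a depth of
101 calls. A sufficiently long subject exhausts Python's recursion limit and
raises RecursionError, which is loosely analogous to running out of system
stack.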

In the forthcoming release of a new API for PCRE (called PCRE2) the amount of
stack per iteration has been reduced - at least, we thought we had reduced it
until I started testing your pattern, when I found it had actually increased.
This turned out to be because gcc was inlining a function, and stopping that
has made the reduction work. For your pattern, on my Linux system, the subject
limit with PCRE is around 18,000 characters; with PCRE2 it is about 22,700. So your
report has been very useful in debugging PCRE2. Thank you!
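
As a rough worked example of what those two limits imply: if the usable stack
is the same in both runs and nearly all of it goes on per-character frames,
the stack cost per character is inversely proportional to the subject limit.
The sketch below works that out in Python. The 8 MiB stack size is purely an
assumption for illustration (it is not stated above), and the helper names are
hypothetical.

```python
def per_char_ratio(old_limit, new_limit):
    """Ratio of new to old stack cost per character.

    Assumes both runs had the same usable stack and that per-character
    frames dominate its use, so cost is proportional to 1 / limit.
    """
    return old_limit / new_limit


def bytes_per_char(stack_bytes, subject_limit):
    """Approximate stack bytes per subject character for a given stack size."""
    return stack_bytes / subject_limit


# Subject limits quoted in the comment above.
PCRE_LIMIT = 18_000
PCRE2_LIMIT = 22_700

# ASSUMPTION: an 8 MiB stack; the real value on the system used is unknown.
ASSUMED_STACK_BYTES = 8 * 1024 * 1024

if __name__ == "__main__":
    ratio = per_char_ratio(PCRE_LIMIT, PCRE2_LIMIT)
    print(f"PCRE2 per-character cost is about {ratio:.1%} of PCRE's")
    print(f"PCRE:  ~{bytes_per_char(ASSUMED_STACK_BYTES, PCRE_LIMIT):.0f} bytes/char")
    print(f"PCRE2: ~{bytes_per_char(ASSUMED_STACK_BYTES, PCRE2_LIMIT):.0f} bytes/char")
```

The ratio does not depend on the assumed stack size; only the absolute
bytes-per-character figures do, and they would change with a different stack
limit, compiler, or optimization level.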

Because stack frame sizes vary so much with compiler and optimization level,
the pcretest -m -C feature that Graycode refers to is no longer present in
PCRE2. Actually, it was probably a "kludge/hack" rather than a feature, and I
am not convinced it ever really worked.

