Re: [pcre-dev] Internal errors and crashes with quantified …

Author: Philip Hazel
Date:  
To: ND, Graycode
CC: Pcre-dev
Subject: Re: [pcre-dev] Internal errors and crashes with quantified subroutines
On Sun, 20 Nov 2011, ND wrote:

> PCRE version 8.20 2011-10-21
> /(a)(?2){0,1350}?(b)/
> Failed: internal error: overran compiling workspace at offset 16


> If '(?2)' replaces with '(?1)' than no problem happens in all cases.


That's because the problem is caused by the forward reference; the
backward reference (?1) is handled differently. PCRE is a one-pass
compiler, so each forward reference has to be remembered until the group
it refers to is reached. The bug was that nothing checked how much
workspace these remembered references used. I have committed a fix; you
may not like it ... it replaces the error above with "too many forward
references". The workaround is to avoid forward references. Your pattern
can be rewritten like this:

/(a)(?(DEFINE)(b))(?2){0,1350}?(?2)/
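
The bookkeeping described above can be illustrated with a toy model. This is
only a sketch under stated assumptions, not PCRE's actual code: the names
fwd_ref_table, fwd_remember, fwd_resolve and the capacity of 8 are all
hypothetical, and it assumes (as the repetition discussion later in this
thread suggests) that a quantified forward call contributes one remembered
reference per repetition. The point is the capacity check made before each
write, which is the kind of check the fix adds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of forward-reference bookkeeping in a one-pass
 * compiler. None of these names or sizes come from PCRE itself. */
#define FWD_REF_CAPACITY 8          /* deliberately tiny "workspace" */

enum { FWD_OK = 0, FWD_TOO_MANY = -1 };

typedef struct {
    int    group[FWD_REF_CAPACITY];   /* group number each pending call targets */
    size_t offset[FWD_REF_CAPACITY];  /* where the call was emitted */
    size_t count;                     /* number of pending references */
} fwd_ref_table;

void fwd_init(fwd_ref_table *t)
{
    assert(t != NULL);
    t->count = 0;
}

/* Remember a call to a group that has not been seen yet.
 * The capacity is checked BEFORE writing, so the table cannot overrun. */
int fwd_remember(fwd_ref_table *t, int group, size_t offset)
{
    if (t->count >= FWD_REF_CAPACITY)
        return FWD_TOO_MANY;          /* report an error instead of overrunning */
    t->group[t->count]  = group;
    t->offset[t->count] = offset;
    t->count++;
    return FWD_OK;
}

/* Called when `group` is finally reached: drop every pending reference to
 * it (a real compiler would patch the code at offset[i] here) and return
 * how many were resolved. */
size_t fwd_resolve(fwd_ref_table *t, int group)
{
    size_t kept = 0, resolved = 0;
    for (size_t i = 0; i < t->count; i++) {
        if (t->group[i] == group) {
            resolved++;
        } else {
            t->group[kept]  = t->group[i];
            t->offset[kept] = t->offset[i];
            kept++;
        }
    }
    t->count = kept;
    return resolved;
}

/* Model a pattern shaped like (a)(?2){0,n}?(b): `repeats` forward calls to
 * group 2 are remembered before group 2 is reached. */
int simulate_repeated_forward_call(size_t repeats)
{
    fwd_ref_table t;
    fwd_init(&t);
    for (size_t i = 0; i < repeats; i++) {
        if (fwd_remember(&t, 2, i) != FWD_OK)
            return FWD_TOO_MANY;
    }
    fwd_resolve(&t, 2);
    return FWD_OK;
}
```

In this model a call such as simulate_repeated_forward_call(9) fails cleanly
with FWD_TOO_MANY rather than writing past the end of the table. In the
rewritten pattern above, group 2 is defined before either (?2), so nothing
would need to be remembered at all.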

> PS I'm not well experienced at english but is there 'overran' in error message
> must be 'overrun'?


No; it is short for "the compiling function overran the workspace", that
is, I'm using it as a past tense verb, rather than "there was an
overrun", but thanks for querying it.

On Mon, 21 Nov 2011, Graycode wrote:

> In case it helps Phillip or someone else, below shows the compiled
> expression sizes as determined by calling pcre_fullinfo() with
> PCRE_INFO_SIZE.
>
> /(a)(?2){0,1995}?(b)/ == 31981
> /(a)(?2){0,1996}?(b)/ == 31997
> /(a)(?2){0,1997}?(b)/ == 32013
> /(a)(?2){0,1998}?(b)/ == 32029
>
> /(a)(?2){0,1999}?(b)/ == Error(52) at offset 16 internal error: overran compiling workspace
> ... same for 2000 thru 2048 ...


Thanks for supplying that; it shortened the time I needed to figure out
what was going wrong (my failures were around 1999 as well).
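
The four sizes reported above grow by a constant amount per extra
repetition, which can be checked with a few lines of arithmetic. The sketch
below only fits the quoted figures; the names sample, bytes_per_repeat and
fitted_size are hypothetical, and the linear formula is an observation about
these four data points, not a statement about PCRE internals.

```c
#include <assert.h>
#include <stddef.h>

/* One measurement from the message above: the upper repeat bound in
 * /(a)(?2){0,N}?(b)/ and the size reported for PCRE_INFO_SIZE. */
typedef struct {
    int  max_repeat;
    long size;
} sample;

const sample reported[] = {
    {1995, 31981},
    {1996, 31997},
    {1997, 32013},
    {1998, 32029},
};
const size_t reported_count = sizeof reported / sizeof reported[0];

/* Return the size increase per extra repetition if it is the same between
 * every pair of consecutive samples (whose repeat bounds must differ by
 * exactly 1); otherwise return -1. */
long bytes_per_repeat(const sample *s, size_t n)
{
    assert(s != NULL);
    if (n < 2)
        return -1;
    long step = s[1].size - s[0].size;
    for (size_t i = 1; i < n; i++) {
        if (s[i].max_repeat - s[i - 1].max_repeat != 1)
            return -1;
        if (s[i].size - s[i - 1].size != step)
            return -1;
    }
    return step;
}

/* Linear fit through the first sample using the measured step.
 * Only meaningful when bytes_per_repeat() did not return -1. */
long fitted_size(const sample *s, size_t n, int max_repeat)
{
    long step = bytes_per_repeat(s, n);
    return s[0].size + step * (long)(max_repeat - s[0].max_repeat);
}
```

For the quoted figures the step is 16 per additional repetition, so the
sizes fit 16*N + 61 over the range 1995 to 1998.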

> Section 'COMPILED PATTERN MEMORY USAGE' of PCREPERFORM(3) describes
> how portions of patterns are repeated, and in this case that operation
> is multiplying the impact of the issue.


Exactly!

Philip

--
Philip Hazel