Re: [pcre-dev] optimizing matches for large strings

Author: Philip Hazel
Date:  
To: Alan Lehotsky
CC: pcre-dev
Subject: Re: [pcre-dev] optimizing matches for large strings
On Fri, 11 Feb 2011, Alan Lehotsky wrote:

> I have a string that's 150,002 characters consisting of
>
>         P1P2......P4....P9.....
>
>
> (That's 'P1', followed by 50k each of 'P2', 'P4', 'P9')
>
> and a simpleminded regex of
>
>    (P1(P2)*(P4)*(P9)*)?


Have you tried (P1(P[249])*)? Actually, don't bother. I don't think it
will make any difference. The thing is, PCRE makes an internal recursive
call each time it enters a parenthesized group, so the recursion depth
still grows with the number of repetitions.
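
To illustrate why the depth tracks the number of repetitions, here is a toy
model in C. It is not PCRE's code: the function name star_depth and its
parameters are hypothetical, and it only mimics the idea of one recursive
call per pass through a repeated group such as (P2)*.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only, NOT PCRE's implementation: match (tok)* at s by making
   one recursive call per repetition, and return the depth reached. */
size_t star_depth(const char *s, const char *tok, size_t depth)
{
    size_t n = strlen(tok);

    /* Another copy of the token matched: recurse for the next repetition. */
    if (n > 0 && strncmp(s, tok, n) == 0)
        return star_depth(s + n, tok, depth + 1);

    /* No further repetition: report how deep the recursion went. */
    return depth;
}
```

In this model the depth is the repetition count, so a run of 50k copies of
'P2' would mean 50k nested calls, whichever way the groups are written.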

> I assume that if I configured PCRE to not use the stack that I could
> make this run (although probably really slowly due to malloc/free
> overhead).


Well, it's still going to use the same amount of memory. So why not make
the stack even bigger instead?
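
On a POSIX system, one way to get a bigger stack from inside the program is
to raise the soft stack limit before matching. The following is a minimal
sketch under that assumption; the helper name raise_stack_limit is
hypothetical, and whether the new limit takes effect for an already-running
main thread depends on the operating system.

```c
#include <sys/resource.h>

/* Hypothetical helper: try to make the soft stack limit at least `wanted`
   bytes. Returns 0 on success, -1 on failure. If `wanted` exceeds the hard
   limit, the request is clamped to the hard limit. */
int raise_stack_limit(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;

    /* Already unlimited or large enough: nothing to do. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
        return 0;

    /* An unprivileged process cannot go above the hard limit. */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;

    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_STACK, &rl);
}
```

A caller might request, say, 100 MB once at startup, before the first match
is attempted. From a shell, "ulimit -s" adjusts the same limit for processes
started afterwards.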

If, as your comment suggests, you really know exactly what's in your
string, what are you using PCRE to try to find? Perhaps there's another
way of solving the problem.

Philip

--
Philip Hazel