Re: [pcre-dev] PCRE suggestion

Author: xgl001
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] PCRE suggestion
Hello Philip and others -

Thank you very much for your earlier reply.
I do hope you don't mind more chatter from me.

Something I would like to see in a future PCRE is an API call that
lets the calling program establish its own run-time settings.

For example, the default stack size for a Windows thread is 1 MB.
I hand-calculated the stack usage of match() to be about 340 bytes
per call with SUPPORT_UCP and SUPPORT_UTF8 defined. Therefore a
default Windows program should expect a stack fault at roughly
3000 recursions (1 MB / 340 bytes). Allowing for the stack used by
the rest of the program, the fault is likely closer to 2500
recursions.
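
The estimate above can be written as a small helper. This is only a sketch of the arithmetic: the function name and the idea of a fixed "reserved" amount for the rest of the program are assumptions for illustration, and the 340-byte frame size is the hand-calculated figure from this message, not a value PCRE reports.

```c
#include <stddef.h>

/* Estimate how many nested match() calls fit in a thread's stack.
   stack_bytes:    total stack size of the thread
   reserved_bytes: stack assumed to be used by the rest of the program
   frame_bytes:    stack consumed per match() call (about 340 here)
   Returns 0 when the inputs leave no room at all. */
unsigned long estimate_max_recursion(size_t stack_bytes,
                                     size_t reserved_bytes,
                                     size_t frame_bytes)
{
    if (frame_bytes == 0 || reserved_bytes >= stack_bytes)
        return 0;
    return (unsigned long)((stack_bytes - reserved_bytes) / frame_bytes);
}
```

With a 1 MB stack (1048576 bytes) and nothing reserved this gives 3084. Reserving an assumed 192 KB for other stack usage gives 2505, which is in line with the "closer to 2500" figure.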

Hence the pcretest program aborts with a stack fault in Windows
at fewer than 3000 recursions when running "testinput2". That
happens unless it is built with a larger-than-default stack, or
unless MATCH_LIMIT_RECURSION is low enough to yield PCRE
error -21 (PCRE_ERROR_RECURSIONLIMIT) before the fault occurs.

To minimize the likelihood of faults, the CONFIG.H for my build
of a PCRE DLL component defines:

#define MATCH_LIMIT_RECURSION 2000

2000 is too low for some applications, yet that limit is fixed
when the PCRE component (static library or DLL) is built, and not
when the application using the component is built.

What seems missing is a simple but standard method for an
application program to specify its PCRE settings at run time.

In my application code I'm trying to account for the stack size
that some dynamic threads are using, and in this case they have
far less than the 1 MB default. For every call of pcre_exec()
I have to pass a pcre_extra structure with the appropriate
flag bit and value to set a PCRE recursion limit.

my_PcreExtra->flags |= PCRE_EXTRA_MATCH_LIMIT_RECURSION;
my_PcreExtra->match_limit_recursion = my_PCRE_MaxRecursion;

For some complicated expressions, the code may have previously
invoked pcre_study(), which may return its own pcre_extra.
In that case the recursion limit needs to be set in that
pcre_extra instead, whenever one is available.
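
The two cases (a pcre_extra from pcre_study(), or none) can be folded into one helper. The sketch below is written against a stand-in struct so that it compiles without pcre.h. The names demo_extra, DEMO_EXTRA_MATCH_LIMIT_RECURSION and with_recursion_limit are hypothetical; in real code the struct would be pcre_extra and the flag PCRE_EXTRA_MATCH_LIMIT_RECURSION.

```c
#include <stdlib.h>

/* Stand-in for the pcre_extra fields used here (assumption: real code
   would include <pcre.h> and use pcre_extra and
   PCRE_EXTRA_MATCH_LIMIT_RECURSION instead). */
#define DEMO_EXTRA_MATCH_LIMIT_RECURSION 0x0010ul

typedef struct demo_extra {
    unsigned long flags;
    unsigned long match_limit_recursion;
} demo_extra;

/* Return an extra block that carries the recursion limit.
   studied: the block produced by the study step, or NULL if none.
   Flags already set in a studied block are preserved.
   Returns NULL only if a new block was needed and allocation failed. */
demo_extra *with_recursion_limit(demo_extra *studied, unsigned long limit)
{
    demo_extra *extra = studied;
    if (extra == NULL) {
        extra = calloc(1, sizeof *extra);  /* zeroed, so no flags set */
        if (extra == NULL)
            return NULL;
    }
    extra->flags |= DEMO_EXTRA_MATCH_LIMIT_RECURSION;
    extra->match_limit_recursion = limit;
    return extra;
}
```

The caller still has to remember whether the block came from the study step or from this helper, since that decides how it must be freed. That bookkeeping is part of what makes the per-call approach cumbersome.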

That's a bit cumbersome. It would be handy to have a "standard"
PCRE API function that could be called at start-up, or whenever
the program wanted to change MATCH_LIMIT_RECURSION, MATCH_LIMIT,
and perhaps any other run-time settings that could be stored as
internal global variables within PCRE.
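
To make the suggestion concrete, here is one possible shape for such a call. Everything in it is hypothetical: demo_pcre_set(), demo_pcre_get() and the DEMO_SET_* identifiers do not exist in PCRE, and the initial values only stand in for whatever the library was built with.

```c
#include <stddef.h>

/* Hypothetical setting identifiers; none of these exist in PCRE. */
enum demo_setting {
    DEMO_SET_MATCH_LIMIT,
    DEMO_SET_MATCH_LIMIT_RECURSION,
    DEMO_SETTING_COUNT
};

/* Library-internal globals holding the current run-time values.
   The initial values stand in for the build-time defaults. */
static unsigned long demo_settings[DEMO_SETTING_COUNT] = {
    10000000ul,  /* assumed build-time MATCH_LIMIT */
    2000ul       /* MATCH_LIMIT_RECURSION from the CONFIG.H above */
};

/* Set one value. Returns 0 on success, -1 for an unknown setting. */
int demo_pcre_set(int what, unsigned long value)
{
    if (what < 0 || what >= DEMO_SETTING_COUNT)
        return -1;
    demo_settings[what] = value;
    return 0;
}

/* Read one value into *out. Returns 0 on success, -1 on bad input. */
int demo_pcre_get(int what, unsigned long *out)
{
    if (out == NULL || what < 0 || what >= DEMO_SETTING_COUNT)
        return -1;
    *out = demo_settings[what];
    return 0;
}
```

An application would call demo_pcre_set(DEMO_SET_MATCH_LIMIT_RECURSION, 1000) once at start-up, and matching would then use the stored value whenever no limit is supplied per call. Plain globals like these are not synchronized, so the sketch assumes they are set before any threads start matching.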

That new API might also be a good choice for an application to
specify the pcre_malloc and pcre_free vectors when it chooses to
use its own methods for those. By the way, I think those are a
fantastic feature.


Best Regards,

Guy.

----- Original Message -----
From: "Philip Hazel" <ph10@???>
To: "xgl001" <xgl001@???>
Cc: <pcre-dev@???>
Sent: Wednesday, July 08, 2009 3:59 AM
Subject: Re: [pcre-dev] PCRE suggestion


> On Tue, 7 Jul 2009, xgl001 wrote:
>
> > I have a suggestion related to the Posix interface to add an optional
> > value for UnGreedy. Below I've suggested using "REG_X_UNGREEDY" with
> > the "_X_" implying it's non-standard.
>
> When I first implemented the Posix interface, I intended to keep it
> strictly standard, but I was persuaded later to add REG_DOTALL and
> REG_UTF8, which are non-standard. So I suppose I cannot really argue
> against REG_UNGREEDY, and I don't think we need the _X_ because we
> already have two non-standard options.
>
> Thank you for your contribution. I am expecting to look at the code of
> PCRE again in a month or two, and I will check out your patch at that
> time.
>
> > If this email is not a good method to send this content then please
> > let me know what forum or other method is best to communicate. The
> > PCRE bug tracker at exim.org did not seem appropriate for mentioning
> > non-bugs.
>
> In fact, people do use Bugzilla sometimes. You can mark your entry with
> severity "wishlist". However, posting to this list is also OK.
>
> > If desired, I can also supply MAK file and some information for
> > building PCRE as a native Windows DLL using Microsoft's Visual C++
> > (version 6). If not desired by your group, I will try to find another
> > way of documenting that.
>
> I am not a Windows user. The Windows users in this group seem happy with
> the existing CMake or "configure" build schemes, but perhaps they do not
> use Visual C++ v6. I would be happy to add your information to my
> NON-UNIX-USE file if people think it would be useful.
>
> Anybody else on the list like to comment on this?
>
> Philip
>
> --
> Philip Hazel