[pcre-dev] [Bug 1174] allow passing of pcre_{malloc, free, …

Author: Graycode
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1174] allow passing of pcre_{malloc, free, stack_malloc, stack_free, callout} as parameters

On 2011-11-08 10:11, Philip Hazel wrote:

> > Basically, the only time this solution would be insufficient, is when
> > a library wants to use some special purpose allocation code, which
> > will probably be rare.
>
> Indeed, that is what I was thinking as well, especially as nobody has
> brought up this issue before. This is certainly the easiest solution,
> and it does not rule out doing something much more drastic later if that
> turns out to be necessary.


I think it would be great if PCRE could invent a pcre_app_config()
function whereby the application could specify its default
limitations and configuration options. It should include things like
the memory allocation / free vectors, match_limit_recursion,
match_limit, etc. These are all currently present in PCRE, either
as static variables or as members of the extra structure. All I'm
suggesting is that a pcre_app_config() could establish default
handling, and that new function could spread those settings back
out into static variables that are private in the library.

The documentation should state that this is for initialization,
not for repeated use on every compile or exec. As highlighted by
Gertjan it should be made clear that altering memory management
vectors by different threads sharing a common PCRE library will lead
to trouble. Having the memory management vectors be set by way of
a "config" function would help to highlight their intended usage.
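
To make the idea concrete, here is a rough sketch of what such a
one-time configuration call might look like. Nothing in it exists in
PCRE: the pcre_app_settings structure, its member names, the lib_*
statics (stand-ins for the library's private variables), the default
limit values and the return codes are all assumptions made up for
illustration. Refusing a second call is just one possible way of
enforcing "initialization only"; the flag is not protected against
concurrent callers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical settings block; these names are not part of PCRE. */
typedef struct pcre_app_settings {
    void *(*malloc_fn)(size_t);          /* NULL keeps the current vector */
    void  (*free_fn)(void *);            /* NULL keeps the current vector */
    unsigned long match_limit;           /* 0 keeps the current default */
    unsigned long match_limit_recursion; /* 0 keeps the current default */
} pcre_app_settings;

/* Stand-ins for static variables private to the library.
   The numeric defaults are placeholders. */
static void *(*lib_malloc)(size_t) = malloc;
static void  (*lib_free)(void *)   = free;
static unsigned long lib_match_limit           = 10000000UL;
static unsigned long lib_match_limit_recursion = 10000000UL;
static int lib_configured = 0;

/* Hypothetical one-time configuration call.
   Returns 0 on success, -1 if already configured,
   -2 if only one of the two memory vectors was supplied. */
int pcre_app_config(const pcre_app_settings *cfg)
{
    assert(cfg != NULL);
    if (lib_configured)
        return -1;
    /* malloc and free must be replaced together or not at all */
    if ((cfg->malloc_fn == NULL) != (cfg->free_fn == NULL))
        return -2;
    if (cfg->malloc_fn != NULL) {
        lib_malloc = cfg->malloc_fn;
        lib_free   = cfg->free_fn;
    }
    if (cfg->match_limit != 0)
        lib_match_limit = cfg->match_limit;
    if (cfg->match_limit_recursion != 0)
        lib_match_limit_recursion = cfg->match_limit_recursion;
    lib_configured = 1;
    return 0;
}
```

An application would fill in one pcre_app_settings at startup, call
the function once before any compile or exec, and never touch the
settings again from worker threads.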

I suggest not requiring that pcre_callout be set that way. I think
it's a different kind of configuration option because it's more
likely to have a different value for different threads of an
application. Consider adding a pcre_callout call-back function
pointer as a member of the extra structure that the application
can assign, next to the callout_data pointer that's already there.

Trying to carry the memory management vectors through the PCRE code
by starting with a pcre_compile3() seems difficult and may be more
trouble than it's worth. The better approach (IMHO) may be to
specify pcre_callout as an option to pcre_exec() by way of the
extra structure where it can more easily be carried through to
the appropriate places in the PCRE code.
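
A small sketch of the per-call callout idea follows. It uses
stand-in types only (demo_extra, demo_callout_block and the function
names are invented here and are not PCRE's real structures); the one
thing taken from the existing API is that a callout_data pointer
already travels in the extra structure. The point is that each
thread can hand its own callback and data to the match call without
touching any global.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the block passed to a callout function. */
typedef struct demo_callout_block {
    int   callout_number;
    void *callout_data;
} demo_callout_block;

typedef int (*demo_callout_fn)(demo_callout_block *);

/* Stand-in for the extra structure. */
typedef struct demo_extra {
    void           *callout_data; /* mirrors the existing member */
    demo_callout_fn callout;      /* hypothetical new member; NULL = none */
} demo_extra;

/* Stand-in for the place inside matching where a callout fires.
   It reads the callback from the extra block, not from a global. */
static int demo_fire_callout(const demo_extra *extra, int number)
{
    demo_callout_block block;

    if (extra == NULL || extra->callout == NULL)
        return 0;                 /* no callout set: carry on matching */
    block.callout_number = number;
    block.callout_data   = extra->callout_data;
    return extra->callout(&block);
}

/* Example callback: bump a counter owned by the caller. */
static int count_callouts(demo_callout_block *block)
{
    int *counter = (int *)block->callout_data;

    assert(counter != NULL);
    *counter += 1;
    return 0;
}
```

Two threads would each build their own demo_extra with their own
counter, so neither can disturb the other's callout handling.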

Keep in mind that the thread that invokes pcre_compile2() may not be
the same thread that will call pcre_exec() to use it. In our case
all the setups including compile() are done by one thread, and later
the exec() using the compiled expressions are done by multiple other
threads. Releasing the compiled expressions is also done by the same
thread that compiled them. That could matter a lot depending on
whether each thread gets its own copy of memory, as with fork(), or
shares it with the others.

Overall I think the rest of the current API is sufficient, though
having a pcre_free_compile() specifically getting a (pcre *) type
is a great idea.


By the way, we do make use of (and rely upon) the PCRE memory
management vectors. Our programs allocate and manage that memory
in ways that are good for the overall application, which is a
deliberate compromise between what may be good for PCRE and
everything else the application does.
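
As an illustration of the kind of thing an application can do with
the vectors (this is not our actual code), here is a minimal
allocator pair that keeps a running total of bytes handed out. In
the real library the application assigns its functions to the
pcre_malloc and pcre_free variables; the demo_pcre_malloc and
demo_pcre_free pointers below are stand-ins so the sketch compiles
without PCRE, and every other name is made up. The counter is not
protected against concurrent use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header kept in front of each block; the union keeps the user
   pointer suitably aligned for common types. */
typedef union alloc_header {
    size_t      size;
    long double align_ld;
    long long   align_ll;
    void       *align_p;
} alloc_header;

static size_t bytes_in_use = 0;

static void *tracking_malloc(size_t size)
{
    alloc_header *h;

    if (size > (size_t)-1 - sizeof *h)
        return NULL;              /* header plus size would overflow */
    h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    bytes_in_use += size;
    return h + 1;                 /* user memory starts after the header */
}

static void tracking_free(void *p)
{
    alloc_header *h;

    if (p == NULL)
        return;
    h = (alloc_header *)p - 1;    /* step back to the header */
    assert(bytes_in_use >= h->size);
    bytes_in_use -= h->size;
    free(h);
}

/* Stand-ins for the library's memory management vectors. */
static void *(*demo_pcre_malloc)(size_t) = malloc;
static void  (*demo_pcre_free)(void *)   = free;

/* Install the application's allocator pair; do this once at startup. */
static void install_tracking_allocator(void)
{
    demo_pcre_malloc = tracking_malloc;
    demo_pcre_free   = tracking_free;
}
```

After install_tracking_allocator() every allocation made through the
vectors shows up in bytes_in_use, which the application can weigh
against its other memory demands.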

We build PCRE as a Windows DLL that is shared among multiple
programs, or multiple instances of the same program. They share
a common code base for the DLL but each program instance gets a
different (unique) image of the library's data. There is an option
to build DLLs with shared data but that's not generally used
because of security and the potential conflicts mentioned in the
original post of this topic.


Regards,
Graycode