Re: [pcre-dev] [Bug 1174] allow passing of pcre_{malloc, fre…

Author: Graycode
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] [Bug 1174] allow passing of pcre_{malloc, free, stack_malloc, stack_free, callout} as parameters

> > I think it would be great if PCRE could invent a pcre_app_config()
> > function whereby the application could specify its default
> > limitations and configuration options. It should include things like
> > the memory allocation / free vectors, match_limit_recursion,
> > match_limit, etc. These are all currently present in PCRE, either
> > as static variables or as members of the extra structure. All I'm
> > suggesting is that a pcre_app_config() could establish default
> > handling, and that new function could spread those settings back
> > out into static variables that are private in the library.
>
> I am not very knowledgeable about threads, but it seems to me that this
> would not work, at least not in a Unix/Linux world (which is where I
> operate) because the static variables would be shared by all threads.
> Unless I am missing something (and that may well be true!) there is no
> concept of "static variables that are private to the library" in
> Unix/Linux. (I also suspect that in most Unix/Linux systems PCRE is
> installed as a shared library.)


It seems I may have misunderstood the issue because I am unfamiliar with
the technical meaning of "installed as a shared library". If that kind
of installation shares a single instance of the library's live data
across multiple programs, then I can neither fully understand it nor
suggest a safe solution.


> I do already have an item on the Wish List that reads as follows:
>
> . Write a wrapper to maintain a structure with specified runtime
> parameters, such as recurse limit, and pass these to PCRE each time
> it is called. Also maybe malloc and free.
>
> In a threaded world, such a wrapper would have to keep the data in
> thread-local storage, possibly passed as an argument. Not really sure
> how this would work.


As I touched upon earlier, I can't see why they would need to be in
thread-local storage unless a shared library exposes a single common
data instance to all programs using that library. Library
implementations of that kind would be very difficult to use safely.
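
For what it's worth, on typical Unix/Linux systems a shared library's
writable data is private to each process, while all threads within one
process see the same copy. Below is a minimal sketch of that
distinction, assuming POSIX threads and C11 `_Thread_local`. The names
shared_limit, per_thread_limit and observe_limits are invented for
illustration and are not part of PCRE.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for a library's file-scope settings. */
static unsigned long shared_limit = 1000;                   /* one copy per process */
static _Thread_local unsigned long per_thread_limit = 1000; /* one copy per thread  */

static void *worker(void *arg)
{
    (void)arg;
    shared_limit = 50;      /* visible to every thread in this process */
    per_thread_limit = 50;  /* changes only the worker thread's copy   */
    return NULL;
}

/* Runs the worker to completion, then reports what the calling thread
   sees. Returns 0 on success, -1 if the thread could not be run. */
int observe_limits(unsigned long *shared_seen, unsigned long *local_seen)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    *shared_seen = shared_limit;     /* the worker's write is seen here   */
    *local_seen = per_thread_limit;  /* the caller's own copy is untouched */
    return 0;
}
```

After observe_limits() returns, the caller sees the worker's change to
the plain static but not to the thread-local one, which is the reason a
per-thread setting would need thread-local storage or an explicit
argument.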

Your idea sounds good, and I must apologise because I thought that
PCRE had storage variables (which I referred to as statics) for
configuration data such as the match_limit. In reviewing the code,
that is not currently the case.

Below is a representation of the kind of app_config() that I suggested.
But without PCRE having storage variables for some of the options,
the utilization of the concept is limited to the storage represented by
the memory allocation / free vectors. In this illustration I'm
suggesting two functions: One that populates the current configuration,
and another that modifies it.

The underlying method by which PCRE implements a configuration option
need not be exposed to the application. Using internal storage
variables within the library makes sense, but those variables can be
made private and not exposed. At some future version the pcre_malloc
and similar vectors could become invisible, no longer exported and no
longer manipulated directly by application code. If there comes a
future desire or need for PCRE to store some configuration option in a
thread-local or otherwise custom structure allocation then so be it.
Well that's the basic idea anyway.
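
As a rough sketch of the "private storage behind two functions" idea,
here is a stand-alone illustration. Everything in it (demo_cfg,
demo_get_config, demo_set_config, the default values) is hypothetical
and not part of PCRE; it only shows that the storage can change, for
example to thread-local, without the application noticing.

```c
#include <stddef.h>

/* Hypothetical settings block; not part of PCRE. */
typedef struct demo_cfg {
    unsigned long match_limit;
    unsigned long match_limit_recursion;
} demo_cfg;

/* Private to this translation unit: "static" keeps the name out of the
   library's exported symbols. Adding _Thread_local here later would give
   each thread its own copy without changing either function's signature. */
static demo_cfg current_cfg = { 10000000UL, 10000000UL }; /* placeholder defaults */

/* Copies the current settings into the caller's structure. */
void demo_get_config(demo_cfg *out)
{
    if (out != NULL)
        *out = current_cfg;
}

/* Applies new settings; a zero field means "leave this one unchanged". */
void demo_set_config(const demo_cfg *in)
{
    if (in == NULL)
        return;
    if (in->match_limit != 0)
        current_cfg.match_limit = in->match_limit;
    if (in->match_limit_recursion != 0)
        current_cfg.match_limit_recursion = in->match_limit_recursion;
}
```

The application only ever touches its own demo_cfg copy, so the library
is free to move current_cfg elsewhere in a later version.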


> > By the way, we do make use of (and rely upon) the PCRE memory
> > management vectors.
>
> I suspected that somebody might; it's good to know that the facility is
> used. That is more of an incentive to improve it if possible.


In our particular case memory is tightly managed, stack is tighter,
and clock time needs to be minimized. We'd rather abandon a RegEx
search than spend too much trying to match an ill-conceived "wild"
expression. Our threads that use PCRE are created using only 32K
total stack size. That's probably your worst-case scenario but PCRE
has been able to do the job. Please pat yourself on the back.


Thanks,
Graycode


/*----------------------------------*/
/* the following would be in pcre.h */

typedef struct pcre_app_cfg_info {
  /* A struct member cannot have function type, so these are pointers
     whether or not VPCOMPAT is defined */
  void *(*cfg_malloc)(size_t);
  void  (*cfg_free)(void *);
  void *(*cfg_stack_malloc)(size_t);
  void  (*cfg_stack_free)(void *);
  /* PCRE does not currently have storage variables for the following */
  unsigned long int match_limit_recursion;
  unsigned long int match_limit;
  /* a growth area for future structure size compatibility */
  long int future_growth_area [12];
} pcre_app_cfg_info;

PCRE_EXP_DECL void pcre_get_app_config(pcre_app_cfg_info *);
PCRE_EXP_DECL void pcre_set_app_config(pcre_app_cfg_info *);


/*---------------------------------------------*/
/* the following would be new code within PCRE */

PCRE_EXP_DEFN void
pcre_get_app_config(pcre_app_cfg_info * app_cfg)
{
  app_cfg->cfg_malloc = pcre_malloc;
  app_cfg->cfg_free = pcre_free;
  app_cfg->cfg_stack_malloc = pcre_stack_malloc;
  app_cfg->cfg_stack_free = pcre_stack_free;
  /* PCRE does not currently have storage variables for the following,
     so report the compile-time defaults */
  app_cfg->match_limit_recursion = MATCH_LIMIT_RECURSION;
  app_cfg->match_limit = MATCH_LIMIT;
}

PCRE_EXP_DEFN void
pcre_set_app_config(pcre_app_cfg_info * app_cfg)
{
  /* a NULL member means "leave the current vector unchanged" */
  if (app_cfg->cfg_malloc != NULL)
     pcre_malloc = app_cfg->cfg_malloc;
  if (app_cfg->cfg_free != NULL)
     pcre_free = app_cfg->cfg_free;
  if (app_cfg->cfg_stack_malloc != NULL)
     pcre_stack_malloc = app_cfg->cfg_stack_malloc;
  if (app_cfg->cfg_stack_free != NULL)
     pcre_stack_free = app_cfg->cfg_stack_free;
  /* PCRE does not currently have storage variables for the following */
  /* !N/A! = app_cfg->match_limit_recursion;
     !N/A! = app_cfg->match_limit;
  */
}



/*----------------------------------------------------*/
/* the following is what an application could contain */
#include <string.h>   /* memset */

void my_pgm_or_thread_initialization( void )
  {
  pcre_app_cfg_info my_info;
  memset (&my_info, 0, sizeof(my_info));
  pcre_get_app_config (&my_info);       /* retrieve previous or default settings */
  my_info.cfg_malloc = my_own_malloc_function;
  my_info.cfg_free = my_own_free_function;
  my_info.match_limit_recursion = 75;
  my_info.match_limit = 50000;
  pcre_set_app_config (&my_info);       /* assign new settings */
  }
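
The my_own_malloc_function and my_own_free_function names above are
placeholders. As a rough sketch of what a tightly budgeted pair might
look like, the following keeps a running byte count and refuses
requests that would exceed a cap. The cap value, the header layout and
the my_bytes_outstanding() helper are assumptions made for this
illustration, and the code is not thread-safe as written.

```c
#include <stddef.h>
#include <stdlib.h>

#define MY_BUDGET_BYTES (64u * 1024u)  /* hypothetical cap, chosen arbitrarily */

/* Header stored in front of each block so the free function knows the
   block's size. The union keeps the user pointer suitably aligned. */
typedef union my_block_header {
    size_t size;
    max_align_t align;
} my_block_header;

static size_t my_bytes_in_use = 0;  /* not thread-safe; a real version would lock */

/* Same signature as the pcre_malloc vector: void *(*)(size_t). */
void *my_own_malloc_function(size_t size)
{
    my_block_header *hdr;

    if (size > MY_BUDGET_BYTES - my_bytes_in_use)
        return NULL;                 /* request would exceed the budget */

    hdr = malloc(sizeof *hdr + size);
    if (hdr == NULL)
        return NULL;

    hdr->size = size;
    my_bytes_in_use += size;
    return hdr + 1;                  /* user memory starts after the header */
}

/* Same signature as the pcre_free vector: void (*)(void *). */
void my_own_free_function(void *ptr)
{
    my_block_header *hdr;

    if (ptr == NULL)
        return;

    hdr = (my_block_header *)ptr - 1;
    my_bytes_in_use -= hdr->size;
    free(hdr);
}

/* Reports how many budgeted bytes are currently allocated. */
size_t my_bytes_outstanding(void)
{
    return my_bytes_in_use;
}
```

A NULL return from the allocator lets the caller abandon the operation
instead of growing past the budget.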