[pcre-dev] [Bug 1134] pcre_fullinfo gives incorrect info for…

Author: Richard Smith
To: pcre-dev
Subject: [pcre-dev] [Bug 1134] pcre_fullinfo gives incorrect info for PCRE_INFO_STUDYSIZE
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=1134




--- Comment #5 from Richard Smith <exim@???> 2011-08-15 10:31:31 ---
Thanks for the update. I haven't seen the doc update yet, but here's what we
currently do. If you foresee any problems for us, per the comment below, could
you let me know? The talk of there being other pointers makes me a little
nervous!

We provide an APL interpreter and do take advantage of being able to
save/restore the pcre and pcre_extra blocks to maintain a (savable) cache of
compiled patterns. Our original design specified that we would not take the
trouble to study patterns when first encountered so as not to waste time with
patterns used only once, but if a pattern is reused from the cache we would
then study it if we had not yet done so. Because of problems we had with saving
the study data, our current release does not actually study at all, but the
current development version does, now that we understand the data length /
pointer issue discussed here.

We treat the data returned by pcre_compile and pcre_study as far as possible as
"black boxes", passing (copies of) them to pcre_exec. We do not directly
manipulate them ourselves, except that before we call pcre_exec we now fix up
the pcre_extra block with code of the form

pcre_extra *ex = ...;
ex->study_data = (char *)ex + sizeof (pcre_extra);

If this is insufficient because of there being other pointers, we need to
decide whether to do any more fixing up, or whether to remove the study phase
again, so any advice would be gratefully received!
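
As a rough sketch of the restore-and-fix-up step described above, the
following uses a hypothetical stand-in structure (extra_like) rather than the
real pcre_extra from pcre.h, and a hypothetical helper name (restore_extra).
It assumes the saved image is one contiguous block with the study data
immediately after the header, which is exactly the assumption in question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for pcre_extra: only the fields this sketch needs.
   The real structure is declared in pcre.h and has more members. */
typedef struct {
    unsigned long flags;
    void *study_data;
} extra_like;

/* Copy a saved image (header followed immediately by study data) into fresh
   memory and repoint study_data at the copy's own payload.
   Assumes the one-contiguous-block layout, which the library does not
   promise. Returns NULL on bad input or allocation failure. */
static extra_like *restore_extra(const void *image, size_t image_len)
{
    extra_like *ex;

    if (image == NULL || image_len < sizeof(extra_like))
        return NULL;
    ex = malloc(image_len);
    if (ex == NULL)
        return NULL;
    memcpy(ex, image, image_len);

    /* The saved pointer refers to the old address; replace it. */
    ex->study_data = (image_len > sizeof(extra_like))
        ? (void *)((char *)ex + sizeof(extra_like))
        : NULL;

    /* The payload, if any, must end exactly at the end of the block. */
    assert(ex->study_data == NULL ||
           (char *)ex->study_data + (image_len - sizeof(extra_like)) ==
               (char *)ex + image_len);
    return ex;
}
```

Any further pointers inside the block would be left stale by this, so it only
covers the single study_data pointer mentioned here.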

> The fact that pcre_study() currently gets two blocks with one malloc()
> should not be relevant to the caller ... it is possible that it could change
> in future.


It is relevant, of course, inasmuch as the caller needs to know how to copy
the data when saving it. Our code currently assumes it is one contiguous block and
copies it as such, but perhaps we should do it in two sections in anticipation
of any change. I think any such change would have some user impact, though,
because there is currently only one call to pcre_free to release the data
again. Either that would need to change, or pcre_free would need to be clever
and work out what it needs to do. However, the docs allow pcre_free to be
overridden by the user:

> The global variables pcre_malloc and pcre_free initially contain the
> entry points of the standard malloc() and free() functions, respec-
> tively. PCRE calls the memory management functions via these variables,
> so a calling program can replace them if it wishes to intercept the
> calls.


So it looks like any such change would require some corresponding changes in
the calling code.
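
For what it's worth, copying "in two sections" might look something like the
sketch below. It again uses a hypothetical stand-in structure (extra_like)
and a hypothetical helper (save_extra) rather than anything from pcre.h, and
it takes the study data length as a parameter; in real code that length would
presumably come from pcre_fullinfo with PCRE_INFO_STUDYSIZE. The header and
the study data are copied separately, following the pointer, so nothing
depends on the two having come from a single malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pcre_extra: only the fields this sketch needs. */
typedef struct {
    unsigned long flags;
    void *study_data;
} extra_like;

/* Write the header and then the study data into out as two separately
   copied sections. The stored pointer is cleared because it is meaningless
   once saved. Returns the number of bytes written, or 0 if the buffer is
   too small or the input is inconsistent. */
static size_t save_extra(const extra_like *ex, size_t study_len,
                         unsigned char *out, size_t out_cap)
{
    extra_like hdr;

    assert(ex != NULL && out != NULL);
    if (study_len > 0 && ex->study_data == NULL)
        return 0;
    if (out_cap < sizeof hdr || out_cap - sizeof hdr < study_len)
        return 0;

    /* Section 1: the header, with the pointer zeroed. */
    hdr = *ex;
    hdr.study_data = NULL;
    memcpy(out, &hdr, sizeof hdr);

    /* Section 2: the study data, wherever it actually lives. */
    if (study_len > 0)
        memcpy(out + sizeof hdr, ex->study_data, study_len);

    return sizeof hdr + study_len;
}
```

This only addresses the save side; the question of how many pcre_free calls
are needed for blocks the library itself allocated is unaffected.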

