Re: [pcre-dev] pcre_compile.c: error_texts

Top Page
Delete this message
Author: Kevin Connor Arpe
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] pcre_compile.c: error_texts
Philip, Zoltán,

Thanks for you feedback. Comments inline below.

> I wouldn't worry about the thread safety of your static array. If you

write the same values to the memory from different threads, that is
> safe. Therefore, if for some reasons the array would be initialized

multiple times, that should be thread safe (as long as you write
> exactly the same values to the array again).


I had not considered this scenario. Good point. I was so concerned about
multiple/parallel/re-init, I forgot the indexer would be identical.

> Alternatively, we can simply use something like:
> > static char* error_texts[] = { "text1", "text2", ... };
> > error_texts[i] would return with the text corresponding to error code.
>
> Isn't that what we are trying to avoid? I suspect that is what it used
> to be before the relocation issue was raised. This is the ChangeLog
> entry for release 7.4:
>


I agree. The comment carefully notes the advantages. I didn't want to
remove the big string -- only build an indexer at the first error.

    Thanks to the folks working on Gregex for glib for this insight.

>


How ironic: I am hacking the exact same library/module now. =)

All of this came about because I have an text box in a GUI app that can
accept regex from users. As the user types, the text is continually
recompiled into a GRegex (which calls PCRE under the hood). Errors are
reported back realtime (tooltip). I want to make sure this feedback is as
fast as possible.


>
> My own view about this is that I don't think any gain is worth the
> effort of making any change. After all, we hope that most of the time
> there are no compilation errors, so it doesn't really matter how long it
> takes to find an error message.
>


Ok. I will take your word. Also, Zoltán makes another good point: I/We
should try to test the impact of failed compiles before implementing a
patch.


> > What if we create a (static) list of offsets that is only initialised at
> > first compile error?
>
> Since pcre_compile() gives up on hitting a compile error, there are only
> ever "first" compile errors, so I'm not quite sure what the proposed gain
> is here.
>


Apologies, I should be clearer. By "first" I do not mean multiple errors
in the same pattern. I mean multiple, sequential calls to pcre_compile().
Imagine the scenario above where user is entering regex in a GUI. This
causes continuous recompile -- after each keystroke. Each recompile will
be different (probably), and may potentially fail. Another case: You have
a big file of regexes that you want to try to compile (test, etc).

So when I say "first", I literally mean the first time error_texts is ever
scanned (after PCRE lib is loaded into memory). At that point, we build
the the indexer. For subsequent compiles that fail, we will have faster
error lookup.

As for cathedrals and bazaars, until a couple of years ago I was the
> only maintainer, though over time plenty of people had sent in
> suggestions, patches, etc. which I scrutinized and usually adopted in
> some form or other. Nowadays Zoltán and Christian Persch have joined me
> in a small "team". It's all quite informal.
>


Understood. I am glad to know you are open to discussions/patches.

Thanks,
Arpe