Hi;
On Fri, 5 Oct 2012 16:58:54 +0100 (BST),
Philip Hazel <ph10@???> wrote:
> > Are any of the other ones dangerous? Afaict not. So limiting
> > this new compile or runtime option's effect to (*UTF8) would be
> > enough.
>
> Of course, any application that is worried about this can itself
> check for the text (*UTF8) at the start of any user pattern that it
> passes on to PCRE. It could even use PCRE to do the check! To do it
> properly, quite a complicated pattern is needed because other
> settings such as (*CR) can precede (*UTF8) at the start of a pattern.
> Something like this should be quite efficient:
>
> ^(?:\(\*\w+\))*?\(\*UTF\d+\)
>
> It only needs to be used if the pattern begins with '(*', so for many
> patterns the extra check will be insignificant.
>
> There are only two bits left in the PCRE options definitions (out of
> 32), and I am rather reluctant to use one of them just for this check.
I agree that those 2 bits are too valuable to waste for this :-)
I think it'd be enough to just document that the application needs to
call pcre_fullinfo() with PCRE_INFO_OPTIONS after compiling the pattern,
and check whether the UTF flag (PCRE_UTF8 in the 8-bit library) is set
in the returned options.
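For applications that would rather reject such patterns before compiling
them, the up-front check Philip describes does not strictly need a regex.
Below is a rough, hand-written C sketch of what his pattern tests for. The
function name pattern_requests_utf() is made up for illustration, and I am
assuming \w means ASCII letters, digits and underscore (PCRE's default
without UCP). It is not part of PCRE.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* \w as assumed here: ASCII letters, digits and underscore. */
static bool is_word_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Hypothetical helper (not a PCRE function): returns true if the pattern
 * begins with zero or more "(*NAME)" items followed by "(*UTF<digits>)",
 * i.e. roughly what ^(?:\(\*\w+\))*?\(\*UTF\d+\) matches. */
bool pattern_requests_utf(const char *p)
{
    assert(p != NULL);

    while (p[0] == '(' && p[1] == '*') {
        const char *name = p + 2;
        const char *end = name;

        while (is_word_char(*end))
            end++;
        if (end == name || *end != ')')
            return false;           /* not a well-formed "(*NAME)" item */

        /* Is NAME exactly "UTF" followed by one or more digits? */
        if (end - name > 3 && strncmp(name, "UTF", 3) == 0) {
            const char *d = name + 3;
            while (d < end && isdigit((unsigned char)*d))
                d++;
            if (d == end)
                return true;
        }
        p = end + 1;                /* skip ")" and look at the next item */
    }
    return false;
}
```

So pattern_requests_utf("(*CR)(*UTF8)abc") would be true, while
"abc(*UTF8)" and "(*CR)abc" would not. Since the loop only runs when the
pattern starts with "(*", the cost for ordinary patterns is negligible.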
Regards,
Christian