Re: [pcre-dev] PCRE is not case insensitive with Greek chara…

Author: Nuno Lopes
Date:  
To: pcre-dev, Dimitrios
Subject: Re: [pcre-dev] PCRE is not case insensitive with Greek characters
>> Upon further investigation, it is apparent that the PCRE library
>> requires some extra "logic", or more specifically a table with the
>> extra Greek characters which are causing the problem.
>
> PCRE is case-insensitive for Greek characters if it is compiled with
> UTF-8 support and Unicode Property Support, and it is then run in UTF-8
> mode. Here is a simple test for this:
>
> PCRE version 7.8 2008-09-05
>
> /\x{391}/8i             <== the pattern, UTF-8, case-insensitive
>  \x{391}               <== a subject string
> 0: \x{391}             <== it matches
>  \x{3B1}               <== a second subject
> 0: \x{3b1}             <== it also matches

>
> The character U+391 is a Greek capital Alpha; the character U+3B1 is a
> lower case alpha. The test matches both. Without the /i option, it does
> not match the second string.
>
> I do not know how the version of PCRE that PHP uses is compiled.


PHP is built with UTF-8 and Unicode property support by default.
I think the problem is that the original PHP test case has a typo in the
capitalized test string.
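
One way to look for such a typo is to compare the lower-case and
capitalized test strings code point by code point after case folding.
The sketch below uses Python's built-in re and unicodedata modules, not
PCRE or PHP, so it only illustrates the idea. The sample strings and the
helper name first_case_mismatch are made up for illustration and are not
taken from the PHP test case. It does not handle accents that are
dropped when Greek text is capitalized.

```python
import re
import unicodedata
from itertools import zip_longest

# Greek capital Alpha (U+0391) and small alpha (U+03B1), the same two
# characters used in the pcretest run quoted above.
CAPITAL_ALPHA = "\u0391"
SMALL_ALPHA = "\u03b1"

# Python 3 str patterns are Unicode-aware, so IGNORECASE covers Greek.
pattern = re.compile(CAPITAL_ALPHA, re.IGNORECASE)
matches_capital = pattern.fullmatch(CAPITAL_ALPHA) is not None
matches_small = pattern.fullmatch(SMALL_ALPHA) is not None


def describe(ch):
    """Return the Unicode name of ch, or a placeholder if there is none."""
    if not ch:
        return "<missing>"
    return unicodedata.name(ch, "<unnamed>")


def first_case_mismatch(lower, upper):
    """Hypothetical helper: find the first position where the two strings
    differ even after case folding. Returns (index, name, name) or None."""
    pairs = zip_longest(lower, upper, fillvalue="")
    for index, (a, b) in enumerate(pairs):
        if a.casefold() != b.casefold():
            return index, describe(a), describe(b)
    return None


# Made-up sample data: the second "capitalized" string starts with a
# Latin capital A, which looks like a Greek capital Alpha on screen.
good = first_case_mismatch("\u03b1\u03b2\u03b3", "\u0391\u0392\u0393")
bad = first_case_mismatch("\u03b1\u03b2\u03b3", "A\u0392\u0393")
print(matches_capital, matches_small)
print(good)
print(bad)
```

If the helper reports a mismatch, the character names show which code
points actually ended up in the test string, which is usually enough to
tell a look-alike letter or a missed character from a real matching bug.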

Nuno