Re: [pcre-dev] PCRE is not case insensitive with Greek chara…

Author: Philip Hazel
Date:
To: Dimitrios
CC: pcre-dev
Subject: Re: [pcre-dev] PCRE is not case insensitive with Greek characters
On Thu, 12 Mar 2009, Dimitrios wrote:

> Upon further investigation, it is apparent that the PCRE library
> requires some extra "logic", or more specifically a table with the
> extra Greek characters which are causing the problem.


PCRE is case-insensitive for Greek characters if it is compiled with
UTF-8 support and Unicode Property Support, and it is then run in UTF-8
mode. Here is a simple test for this:

PCRE version 7.8 2008-09-05

/\x{391}/8i             <== the pattern, UTF-8, case-insensitive
  \x{391}               <== a subject string
 0: \x{391}             <== it matches
  \x{3B1}               <== a second subject
 0: \x{3b1}             <== it also matches


The character U+391 is a Greek capital Alpha; the character U+3B1 is a
lower case alpha. The test matches both. Without the /i option, it does
not match the second string.
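
For anyone without pcretest to hand, the same check can be sketched in
a scripting language. The snippet below uses Python's built-in re
module, which is not PCRE, so it only illustrates the expected
behaviour and says nothing about how any particular PCRE build was
compiled. The names ALPHA_UPPER, ALPHA_LOWER and matches are made up
for this example.

```python
import re

# Greek capital Alpha (U+0391) and lower case alpha (U+03B1),
# the same two characters used in the pcretest run above.
ALPHA_UPPER = "\u0391"
ALPHA_LOWER = "\u03b1"

# Python 3 str patterns are Unicode-aware by default, which plays the
# role of UTF-8 mode here; re.IGNORECASE plays the role of /i.
caseless = re.compile(ALPHA_UPPER, re.IGNORECASE)
caseful = re.compile(ALPHA_UPPER)


def matches(pattern, subject):
    """Return True if the compiled pattern matches anywhere in subject."""
    return pattern.search(subject) is not None


results = {
    "caseless_upper": matches(caseless, ALPHA_UPPER),
    "caseless_lower": matches(caseless, ALPHA_LOWER),
    "caseful_upper": matches(caseful, ALPHA_UPPER),
    "caseful_lower": matches(caseful, ALPHA_LOWER),
}

for name, matched in results.items():
    print(name, matched)
```

With the caseless pattern both subjects should match; with the caseful
pattern only the capital Alpha should.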

I do not know how the version of PCRE that PHP uses is compiled.

Philip

--
Philip Hazel