Re: [pcre-dev] PCRE release 7.5

Author: Philip Hazel
Date:  
To: Wincent Colaiuta
CC: pcre-dev
Subject: Re: [pcre-dev] PCRE release 7.5
On Thu, 10 Jan 2008, Wincent Colaiuta wrote:

> Compiles and works out the box on Mac OS X 10.5.1. There is one locale-
> related failure, which as noted is almost certainly not a bug in PCRE.
> The same failure occurred with 7.4. Any suggestions on how I could
> investigate this failure and get it passing?
>
> Test 3: locale-specific features (using 'fr_FR' locale)
> --- ./testdata/testoutput3    2007-07-30 13:20:48.000000000 +0200
> +++ testtry    2008-01-10 19:08:05.000000000 +0100
> @@ -150,7 +150,7 @@
>   ------------------------------------------------------------------
>           Bra
>           [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff]
> -        [a-z\xb5\xdf-\xf6\xf8-\xff]
> +        [a-z\xaa\xb5\xba\xdf-\xf6\xf8-\xff]


The difference is that \xaa and \xba are treated as [:lower:] in your
French locale but not in mine. These characters are the "feminine
ordinal indicator" and the "masculine ordinal indicator"; each is a
small underlined letter ('a' and 'o', respectively). In Unicode, they
are indeed flagged as lower case letters, but the (non-Unicode) locale
tables evidently disagree from one system to another.
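
One way to investigate is to ask the C library directly which bytes it
considers lower case in a given locale. The following is only a rough
sketch, not part of PCRE: the helper names is_lower_in_locale and
print_high_lower are made up for illustration, and it assumes that the
locale name passed in (for example "fr_FR") is installed and selects a
single-byte character set on the system in question.

```c
#include <ctype.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if islower() accepts `byte` under the
   named locale, 0 if it does not, and -1 if the locale is unavailable.
   For simplicity it resets LC_CTYPE to "C" afterwards rather than
   restoring whatever locale was previously in effect. */
int is_lower_in_locale(const char *locale_name, int byte)
{
    int result;

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    result = islower(byte & 0xff) != 0;
    setlocale(LC_CTYPE, "C");
    return result;
}

/* Hypothetical helper: prints every byte in the range 0x80-0xff that the
   named locale classifies as lower case, and returns how many there were
   (or -1 if the locale is unavailable). */
int print_high_lower(const char *locale_name)
{
    int c, count = 0;

    for (c = 0x80; c <= 0xff; c++) {
        int r = is_lower_in_locale(locale_name, c);
        if (r < 0)
            return -1;
        if (r) {
            printf("\\x%02x ", c);
            count++;
        }
    }
    printf("\n");
    return count;
}
```

Calling print_high_lower("fr_FR") from a small main() on each system
and comparing the two lists should show whether \xaa and \xba are the
only bytes on which the locale tables differ.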

I am not sure there is anything that can be done about this. I am also
not sure if anything should be done about it, as the world does seem to
be slowly moving away from single-byte character sets. So if we wait
long enough, the problem will go away. :-) Maybe.

Philip

--
Philip Hazel