Re: [pcre-dev] Here is pcre-7.1-RC1 for you to play with

Author: Ralf Junker
Date:  
To: pcre-dev
Subject: Re: [pcre-dev] Here is pcre-7.1-RC1 for you to play with

>I have just put a first release candidate for 7.1 in


Works fine for me, especially the new *.generic files.

> Nowadays, the use of locales is
> going out of fashion with the rise of Unicode, and given the
> increasing international nature of everything, it might make sense to
> have a fixed default locale for PCRE (the "C" locale, presumably).


May I suggest defaulting to the ISO-8859-1 character set instead? It corresponds to the first 256 Unicode codepoints and is therefore fully compatible with PCRE in UTF-8 mode. The upper 128 codepoints allow caseless searches for most European languages, covering characters such as German umlauts and French accented letters. ISO-8859-1 is also the default encoding for HTML.
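To illustrate what such a default would mean for caseless matching, here is a minimal sketch of an ISO-8859-1 lowercase mapping. The function names are made up for this example and are not taken from PCRE or dftables.c; the sketch only shows the case pairs that exist within the first 256 codepoints.

```c
#include <stddef.h>

/* Illustrative only: these names are hypothetical, not PCRE functions.
   Lowercase one ISO-8859-1 byte. Byte values 0x00-0xFF map one-to-one
   onto Unicode codepoints U+0000-U+00FF. */
static unsigned char latin1_tolower(unsigned char c)
{
    /* ASCII A-Z */
    if (c >= 0x41 && c <= 0x5A)
        return (unsigned char)(c + 0x20);
    /* 0xC0-0xDE are uppercase letters, except 0xD7 (multiplication sign) */
    if (c >= 0xC0 && c <= 0xDE && c != 0xD7)
        return (unsigned char)(c + 0x20);
    /* Everything else, including 0xDF (sharp s), is left unchanged */
    return c;
}

/* Fill a 256-entry lowercase table, one entry per byte value */
static void latin1_build_lower_table(unsigned char table[256])
{
    int i;
    for (i = 0; i < 256; i++)
        table[i] = latin1_tolower((unsigned char)i);
}

/* Compare two ISO-8859-1 buffers of length n, ignoring case.
   Returns 1 if they are equal, 0 otherwise. */
static int latin1_caseless_equal(const unsigned char *a,
                                 const unsigned char *b, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (latin1_tolower(a[i]) != latin1_tolower(b[i]))
            return 0;
    }
    return 1;
}
```

With this mapping, 0xC4 (A with diaeresis) and 0xE4 (a with diaeresis) compare equal, while 0xD7 and 0xF7 (the multiplication and division signs) stay distinct.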

For non-European languages, it would also be nice if a future version of PCRE had a callback function to convert case for codepoints greater than 255.
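A rough sketch of what I have in mind follows. To be clear, nothing like this exists in PCRE today: the callback type, its signature, and the helper below are purely hypothetical, and the example callback covers only two small ranges for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type, NOT part of the PCRE API: given a codepoint,
   return its lowercase form, or the codepoint itself if it has none. */
typedef uint32_t (*case_fold_fn)(uint32_t codepoint, void *user_data);

/* Example application-supplied callback handling two ranges only */
static uint32_t example_fold(uint32_t cp, void *user_data)
{
    (void)user_data; /* unused in this example */
    /* Greek capitals U+0391-U+03A9 (U+03A2 is unassigned) */
    if (cp >= 0x0391 && cp <= 0x03A9 && cp != 0x03A2)
        return cp + 0x20;
    /* Cyrillic capitals U+0410-U+042F */
    if (cp >= 0x0410 && cp <= 0x042F)
        return cp + 0x20;
    return cp;
}

/* How a matcher might consult the callback: codepoints up to 255 are
   assumed to be handled by the built-in tables and are returned as-is
   here; anything larger is passed to the callback, if one is set. */
static uint32_t fold_above_latin1(uint32_t cp, case_fold_fn fold,
                                  void *user_data)
{
    if (cp > 255 && fold != NULL)
        return fold(cp, user_data);
    return cp;
}
```

The user_data pointer is there so that an application could pass its own lookup tables through without resorting to globals.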

> This could be done by distributing the pcre_chartables.c file instead
> of generating it dynamically. (This does not prevent callers of
> PCRE from providing alternative tables at run time if they want to.)
>
> Having a fixed set of default tables would also get round the
> occasional problem in the tests that is caused by somebody building
> PCRE in an unusual locale.


I remember once asking for a pcre_chartables.c for exactly this reason. On the other hand, it would not hurt to keep dftables.c available for anyone who prefers to generate the character tables at compile time instead of at run time.

Ralf