Re: [pcre-dev] Using PCRE upon Asian and other two-byte national codings

Author: ND
Date:
To: Pcre-dev
Subject: Re: [pcre-dev] Using PCRE upon Asian and other two-byte national codings

On 2013-11-23 16:07, ph10 wrote:
> On Sat, 23 Nov 2013, Zoltán Herczeg wrote:
>
> Currently PCRE character tables can only hold lowercase / flipped case
> and various type bits for the first 256 characters. Supporting the
> whole 64K character set in 16 bit mode would take 409600 bytes of
> memory, which is less than half a megabyte. Today, even smartphones can
> afford that cost. The trade-off would be that the same tables could no
> longer be shared between 8/16/32 bit modes, since the lowercase / flipped
> case tables would depend on the natural character length. Hence a table
> with only 256 characters would be bigger in 16/32 bit mode than now.
> (Note: the table size would always be divisible by 256. This would
> allow us to leave 8 bit mode unchanged, but we could also support
> character sets which do not have 64K characters in 16 bit and
> especially in 32 bit mode, where we have 4096M characters).
> I am sure we cannot do this for 8.34 (this is not an easy task), but if
> this is important for many people, we might think about this later.

I think it will be useful.
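
For reference, the 409600-byte figure quoted above can be reproduced with a
small calculation. This is only a sketch under assumptions of mine, not
something stated in the quoted message: that the extended tables would keep
the current four-part layout (lowercase table, flipped-case table, ten
character-class bitmaps, one type byte per character), and that only the two
case tables grow with the code unit width. The helper name table_bytes is
hypothetical and is not part of PCRE.

```c
#include <assert.h>
#include <stddef.h>

/* Number of character-class bitmaps in the current PCRE 8.x tables
   (320 bytes of class bits for 256 characters = 10 bitmaps of 32 bytes). */
#define CLASS_BITMAPS 10

/* Hypothetical helper, not a PCRE function: estimated size in bytes of a
   character table set covering `nchars` characters, where each case-table
   entry is `unit_bytes` wide (1, 2 or 4 for 8/16/32 bit mode).
   Assumes the existing layout is simply scaled up. */
static size_t table_bytes(size_t nchars, size_t unit_bytes)
{
    /* The quoted proposal says the table size is always divisible by 256. */
    assert(nchars % 256 == 0);

    size_t lcc    = nchars * unit_bytes;          /* lowercase table        */
    size_t fcc    = nchars * unit_bytes;          /* flipped-case table     */
    size_t cbits  = CLASS_BITMAPS * (nchars / 8); /* one bit per char/class */
    size_t ctypes = nchars;                       /* one type byte per char */

    return lcc + fcc + cbits + ctypes;
}
```

Under these assumptions table_bytes(256, 1) gives 1088, the size of the
current tables, table_bytes(65536, 2) gives 409600, and table_bytes(256, 2)
gives 1600, which illustrates why a 256-character table would be bigger in
16 bit mode than now.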

Thanks.
