Re: [pcre-dev] Using PCRE upon Asian and other two-byte nati…

Author: Ze'ev Atlas
Date:  
To: Pcre-dev
Subject: Re: [pcre-dev] Using PCRE upon Asian and other two-byte national codings
Whatever you do, please take care not to break the EBCDIC support (which is now implemented as an either/or alternative to ASCII).
 
Ze'ev Atlas



________________________________
From: ND <nadenj@???>
To: Pcre-dev <Pcre-dev@???>
Sent: Sunday, November 24, 2013 6:01 AM
Subject: Re: [pcre-dev] Using PCRE upon Asian and other two-byte national codings


On 2013-11-23 16:07, ph10 wrote:
> On Sat, 23 Nov 2013, Zoltán Herczeg wrote:
>
> Currently, PCRE character tables can only hold lowercase / flipped-case
> and various type bits for the first 256 characters. Supporting the
> whole 64K character set in 16-bit mode would take 409600 bytes of
> memory, which is less than half a megabyte. Today, even smartphones can
> afford that cost. The trade-off would be that the same tables could no
> longer be shared between 8/16/32-bit modes, since the lowercase /
> flipped-case tables would depend on the natural character length. Hence a
> table with only 256 characters would be bigger in 16/32-bit mode than now.
> (Note: the table size would always be divisible by 256. This would
> allow us to leave 8-bit mode unchanged, but we could also support
> character sets which do not have 64K characters in 16-bit mode and
> especially in 32-bit mode, where we have 4096M characters.)
> I am sure we cannot do this for 8.34 (this is not an easy task), but if
> this is important for many people, we might think about this later.
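
For reference, the 409600-byte figure quoted above can be reproduced with a small calculation. This is only a sketch: it assumes the extended tables would keep the same four sections as the current 8-bit ones (a lowercase map and a case-flip map with one code unit per character, ten class bitmaps with one bit per character, and one byte of type bits per character). The name table_bytes and the CLASS_BITMAPS constant are made up for illustration and are not part of PCRE.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: ten class bitmaps, as in the current 8-bit tables. */
#define CLASS_BITMAPS 10

/* Hypothetical helper: bytes needed for tables covering n_chars
   characters when each code unit is unit_bytes wide (1, 2 or 4). */
static size_t table_bytes(size_t n_chars, size_t unit_bytes)
{
    /* The quoted note says the table size is always divisible by 256. */
    assert(n_chars % 256 == 0);

    size_t lower  = n_chars * unit_bytes;          /* lowercase map     */
    size_t flip   = n_chars * unit_bytes;          /* flipped-case map  */
    size_t cbits  = CLASS_BITMAPS * (n_chars / 8); /* one bit per char  */
    size_t ctypes = n_chars;                       /* one byte per char */

    return lower + flip + cbits + ctypes;
}
```

Under these assumptions, table_bytes(65536, 2) gives 409600, table_bytes(256, 1) gives 1088, and table_bytes(256, 2) gives 1600, which illustrates the point that a 256-character table would be bigger in 16-bit mode than it is now.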


I think it will be useful.

Thanks.

--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev