[pcre-dev] [Bug 1049] Add support for UTF-16

Author: Philip Hazel
Date:  
To: pcre-dev
Old-Topics: [pcre-dev] [Bug 1049] New: Add support for UTF-16
Subject: [pcre-dev] [Bug 1049] Add support for UTF-16

http://bugs.exim.org/show_bug.cgi?id=1049




--- Comment #18 from Philip Hazel <ph10@???> 2011-11-14 15:24:07 ---
On Mon, 14 Nov 2011, Thorsten Schöning wrote:

> If I understood you correctly, I'm the one handling both sizes. :-) In one of
> my applications we worked with standard char in the windows-1252 codepage and
> needed to support Unicode in some places and decided to use what Windows
> supports by default, which is wchar_t as a 16-bit datatype with UTF-16 encoding.
> This resulted in classes which work with PCRE on windows-1252 encoding and
> UTF-8 encoding at the same time. If PCRE supports UTF-16 or at least 16-bit wide
> strings natively in future versions, I would consider using those instead of
> converting our strings to UTF-8 before using PCRE. But of course I could only
> do this for new code and would really appreciate it if all modes could be used
> in the same class.


That's the kind of application I figured might exist. It should be
possible to do what you want, at the cost of using two libraries instead
of one.

> Just as a hint: In January this year I asked about support for std::wstring in
> pcrecpp. Is this something to consider again now? The topic was "implementing
> support for std::wstring in pcrecpp".


The C++ wrapper is maintained by some folks at Google, not by me. (I'm
not a C++ programmer ... somehow got away without ever needing to learn
it, not sure how. :-) Once we have 16-bit support in the native library,
extending it to C++ is an obvious next step.

Philip

