Author: Philip Hazel
Date:
To: 1049
CC: pcre-dev
Subject: Re: [pcre-dev] [Bug 1049] Add support for UTF-16
On Mon, 14 Nov 2011, Thorsten Schöning wrote:
> If I understood you correctly, I'm the one handling both sizes. :-) In one of
> my applications we worked with standard char in the windows-1252 codepage and
> needed to support Unicode in some places, so we decided to use what Windows
> supports by default, which is wchar_t as a 16-bit datatype with UTF-16 encoding.
> This resulted in classes which work with PCRE on windows-1252 encoding and
> UTF-8 encoding at the same time. If PCRE supports UTF-16 or at least 16-bit wide
> strings natively in future versions, I would consider using those instead of
> converting our strings to UTF-8 before using PCRE. But of course I could only
> do this for new code and would really appreciate it if all modes could be used in
> the same class.
That's the kind of application I figured might exist. It should be
possible to do what you want, at the cost of using two libraries instead
of one.
> Just as a hint: In January this year I asked about supporting std::wstring in
> pcrecpp. Is this something to consider again now? The topic was "implementing
> support for std::wstring in pcrecpp".
The C++ wrapper is maintained by some folks at Google, not by me. (I'm
not a C++ programmer ... somehow got away without ever needing to learn
it, not sure how. :-) Once we have 16-bit support in the native library,
extending it to C++ is an obvious next step.