[pcre-dev] [Bug 1049] Add support for UTF-16

Author: Sheri
Date:  
To: pcre-dev
Old-Topics: [pcre-dev] [Bug 1049] New: Add support for UTF-16
Subject: [pcre-dev] [Bug 1049] Add support for UTF-16
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=1049




--- Comment #37 from Sheri <silvermoonwoman@???> 2011-12-21 17:53:36 ---
On 12/21/2011 7:16 AM, Zoltan Herczeg wrote:
> --- Comment #36 from Zoltan Herczeg<hzmester@???> 2011-12-21 12:16:19 ---
> Hi all,
>
> we had a discussion with Philip about 16-bit support in pcregrep, and it
> would be good to hear more opinions about it. The following questions
> were raised:
>
> - Is anyone interested in using it? This is actually an important question,
> since if no one wants to use it, it is probably not worth the effort.
> - Does anyone know of other 16-bit greps? Do they accept 16-bit patterns from
> the command line? If so, how? Do they treat all input as 16-bit data?
> - Shall we accept 8-bit patterns in 16-bit mode? (Note: patterns can be read
> from a file)
> - What features do you want?
>
>


My 2 c: If pcregrep supported 16-bit, then yes, it should accept 8-bit
patterns and file names in 16-bit mode so that grep requests can be run
from the command line. But since stdin is not Unicode, the input data
would need to come from files. Without special coaxing the CMD console
will not display Unicode, so in a CMD console window a user would likely
redirect output to a file (which I suppose should get a BOM?) for viewing
elsewhere. In a PowerShell console window, though, viewing results
directly in the console might work.
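
To make the idea concrete, here is a rough sketch in Python (not pcregrep
code) of the two steps I have in mind: widening an 8-bit pattern to 16-bit
code units, and writing results to a file that starts with a BOM. The
function names are made up for illustration, and I am assuming the 8-bit
pattern is UTF-8 and the 16-bit side is little-endian UTF-16.

```python
import codecs


def widen_pattern(pattern_bytes, source_encoding="utf-8"):
    """Convert an 8-bit pattern to UTF-16LE code units (no BOM).

    Assumption: the command-line pattern is UTF-8; a real tool might
    instead use the console code page.
    """
    return pattern_bytes.decode(source_encoding).encode("utf-16-le")


def write_results_with_bom(path, lines):
    """Write result lines as UTF-16LE, preceded by a byte order mark."""
    with open(path, "wb") as out:
        out.write(codecs.BOM_UTF16_LE)  # bytes FF FE
        for line in lines:
            out.write((line + "\r\n").encode("utf-16-le"))


# Each ASCII character becomes two bytes, low byte first.
print(widen_pattern(b"a.c"))  # b'a\x00.\x00c\x00'
```

Whether pcregrep should default to little-endian, or follow the BOM of the
input file, is a separate question that this sketch does not answer.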

If you want Windows users to run any tests on the preliminary 16-bit work
that's already done, you should update the CMake files and run the
PrepareRelease script. Maybe you could label test versions "PT" (for
"preliminary test") instead of "RC".

Here are some interesting links about the Windows console:

<http://blogs.msdn.com/b/michkap/archive/2010/10/07/10072032.aspx>

<http://perlingresprogramming.blogspot.com/2010/11/printing-unicode-on-windows-console-and.html>


--
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email