Author: Martin Jerabek
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 791] UTF-8 support does not work on EBCDIC platforms
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=791




--- Comment #6 from Martin Jerabek <martin.jerabek@???> 2008-12-19 14:31:48 ---
On 17.12.2008 16:24, Philip Hazel wrote:
> On Wed, 17 Dec 2008, Martin Jerabek wrote:
>
>
>> If it is acceptable to you I will
>> modify the sources in such a way that I replace all character constants
>> with macros which are defined as normal literals (e.g. '*') or as UTF-8
>> literals (e.g. '\x2A') depending on --enable-utf8:
>>
>
> Yes, that seems OK to me.
>


I have finished changing most character and string literals to macros whose
definitions depend on SUPPORT_UTF8. I verified on 64-bit Linux and 32-bit AIX
that RunTest and RunGrepTest report no errors after my modifications. I also
compiled the sources on z/OS, once with --enable-ebcdic and once with
--enable-utf8 (actually with --enable-unicode-properties, which implies
--enable-utf8). Unfortunately, the official test cases do not work on EBCDIC
platforms. I made a quick check that pcregrep still works in trivial cases in
EBCDIC mode, and I tested with our own sources that UTF-8 mode on z/OS works
as expected.
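
To illustrate the approach, here is a minimal sketch of such macros. The
names CHAR_ASTERISK, CHAR_NL, STR_ASTERISK, is_star and check_literals are
chosen for this example only and are not necessarily the ones used in the
patch; the sketch assumes that SUPPORT_UTF8 is defined by the build when
--enable-utf8 is given.

```c
#include <assert.h>

#ifdef SUPPORT_UTF8
/* Fixed code points: the same numeric values on ASCII and EBCDIC hosts. */
#define CHAR_ASTERISK '\x2A'
#define CHAR_NL       '\x0A'
#define STR_ASTERISK  "\x2A"
#else
/* Native literals: the compiler picks the platform's own encoding. */
#define CHAR_ASTERISK '*'
#define CHAR_NL       '\n'
#define STR_ASTERISK  "*"
#endif

/* Returns 1 if c is the asterisk in the encoding selected above. */
static int is_star(int c)
{
    return c == CHAR_ASTERISK;
}

/* Sanity check: the character and string forms of a macro must agree. */
static void check_literals(void)
{
    assert(STR_ASTERISK[0] == CHAR_ASTERISK);
    assert(STR_ASTERISK[1] == '\0');
}
```

Code that previously compared against '*' would compare against
CHAR_ASTERISK instead, so the same source works in both build modes.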

I also made slight modifications to pcregrep.c and pcretest.c: the
return value of pcre_config(PCRE_CONFIG_NEWLINE) was checked against
character literals ('\r', '\n'), but as the API documentation says, this
function always returns 10, 13, 3338, -2, or -1, even on EBCDIC platforms.
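
A minimal sketch of the kind of check I mean, comparing against the numeric
codes instead of character literals. The enum names and the helpers
newline_name and check_crlf_encoding are hypothetical and exist only for
this example; they are not taken from pcregrep.c or pcretest.c.

```c
#include <assert.h>
#include <string.h>

/* Numeric newline codes as listed in the API documentation. They are
   written as numbers on purpose: a literal such as '\n' takes the
   platform's own value, which is not 10 on EBCDIC. */
enum {
    NL_LF      = 10,
    NL_CR      = 13,
    NL_CRLF    = 3338,
    NL_ANYCRLF = -2,
    NL_ANY     = -1
};

/* Maps a newline code to a short description. */
static const char *newline_name(int code)
{
    switch (code) {
    case NL_LF:      return "LF";
    case NL_CR:      return "CR";
    case NL_CRLF:    return "CRLF";
    case NL_ANYCRLF: return "ANYCRLF";
    case NL_ANY:     return "ANY";
    default:         return "unknown";
    }
}

/* 3338 is simply CR and LF packed into one integer: 13 * 256 + 10. */
static void check_crlf_encoding(void)
{
    assert(NL_CRLF == ((NL_CR << 8) | NL_LF));
}
```

In a program linked against PCRE, the code would be obtained with
pcre_config(PCRE_CONFIG_NEWLINE, &code) and then passed to a function like
newline_name().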

In what form would you like my modifications: all modified source
files, or just diffs? If diffs, in which format? Which archive format
(tar, pax, gzip, bzip2, zip)? I assume I should attach my modifications
to the Bugzilla entry, right?

Regards
Martin Jerabek

