Author: Ze'ev Atlas Date: To: Ze'ev Atlas, pcre exim Subject: Re: [pcre-dev] issue with Posix on z/OS
Hi All
Referring to the snippet below: I compared the whole z/OS PCRE library against the 8-bit PCRE library 8.33. Except for the usual changes (forcing CONFIG.h, changing all external names to the 8-character upper-case form that z/OS mandates, and so on), the code is identical between the two libraries. Remember that ultimately the library's intended target character set is EBCDIC, not ASCII or UTF-8. That necessitated replacing 'ÿ' with \xFF and '^' with the IBM "logical not" sign '¬'. However, all those changes have worked fine in all modules since 8.31, and they work fine in pcregrep.
I wonder whether pcre_dfa_exec.c, pcreposix.h or pcreposix.c contains a reference to some ASCII-specific character or sequence that might cause it to balk on EBCDIC.
I did find things like this in pcre_dfa_exec.c, which is obviously incorrect in an EBCDIC context, since 32 and 127 are ASCII code points:
<snippet>
#ifdef PCRE_DEBUG
printf ("%.*sProcessing state %d c=", rlevel*2-2, SP, state_offset);
if (clen == 0) printf("EOL\n");
else if (c > 32 && c < 127) printf("'%c'\n", c);
else printf("0x%02x\n", c);
#endif
<snippet>
or
<snippet>
#ifndef EBCDIC
case 0x2028:
case 0x2029:
#endif /* Not EBCDIC */
<snippet>
I do not know what to make of these, but I do not think either of them caused the issue.
The program crashes in a loop that uses the pchars function, and I believe the problem is in pcretest.c itself.
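For illustration only, the range test `c > 32 && c < 127` in the debug snippet above could be written without assuming ASCII code points. The following is a minimal sketch, not PCRE code: the helper name `format_debug_char` is hypothetical, and it relies on the standard `isgraph()` from `<ctype.h>`, which classifies characters according to the execution character set and so should behave sensibly under both ASCII and EBCDIC.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of PCRE): format one code unit for debug
   output without hard-coding ASCII code points.
   Returns the number of characters snprintf() would have written. */
static int format_debug_char(char *out, size_t outsize, unsigned int c)
{
    /* isgraph() is true for printable characters other than space, which
       is what "c > 32 && c < 127" expresses on an ASCII machine. The
       argument must be representable as unsigned char, hence the guard. */
    if (c <= 0xFF && isgraph((int)c))
        return snprintf(out, outsize, "'%c'", (int)c);

    /* Anything else is shown in hexadecimal, as in the original code. */
    return snprintf(out, outsize, "0x%02x", c);
}
```

Calling `format_debug_char(buf, sizeof buf, 'A')` fills `buf` with `'A'` on either kind of system, because the character literal is translated by the compiler to the local code point.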
<snippet>
The -p option causes it (pcretest) to get into an infinite loop of printing \x00 and ultimately crash when the program tries to write the next \x00 beyond the buffer size (I gave it 30K and it used all of it; the printout below shows only the first 305 characters of the output).
--------------------------------
PCRE version 8.33 2013-05-28
/-- This set of tests is for features that are compatible with all versions of
Perl >= 5.10, in non-UTF-8 mode. It should run clean for both the 8-bit and
16-bit PCRE libraries. --/
/the quick brown fox/
the quick brown fox
0:
1: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
--------------------------------
<snippet>
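As a way of narrowing this down, it might help to check the offsets that come back from the POSIX wrapper before they reach the printing loop, and to bound the printing itself. The sketch below is not taken from pcretest.c; both function names are hypothetical, and it only assumes that a match is reported as a start and end offset into the subject. Whether a bad offset is really what drives the \x00 loop is an assumption that would need to be confirmed on z/OS.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical check: a reported match span must lie inside the subject. */
static int span_is_sane(long start, long end, size_t subject_len)
{
    return start >= 0 && end >= start && (size_t)end <= subject_len;
}

/* Hypothetical bounded printer: copies printable bytes as-is, escapes the
   rest as \xhh, and stops before the output buffer would overflow.
   Returns the number of input bytes consumed. */
static size_t escape_bounded(const unsigned char *p, size_t len,
                             char *out, size_t outsize)
{
    size_t used = 0, i;

    if (outsize == 0) return 0;
    for (i = 0; i < len; i++) {
        size_t need = isprint(p[i]) ? 1 : 4;    /* "c" or "\xhh" */
        if (used + need + 1 > outsize) break;   /* keep room for the NUL */
        if (need == 1)
            out[used] = (char)p[i];
        else
            snprintf(out + used, need + 1, "\\x%02x", (unsigned)p[i]);
        used += need;
    }
    out[used] = '\0';
    return i;
}
```

With the 19-character subject "the quick brown fox", a span of 0 to 19 passes `span_is_sane`, while a span ending at 30000 is rejected before any printing starts. If the rejected case shows up with -p, that would point at the offsets rather than at the printing routine.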