Author: ph10
Date:
To: Ze'ev Atlas
CC: pcre-dev@exim.org
Subject: Re: [pcre-dev] issues with EBCDIC and pcretest
On Tue, 16 Jun 2015, Ze'ev Atlas wrote:
> Hi Philip
> I could create good EBCDIC equivalent for most cases that are derived of testinput1 (in which my output differs significantly from the ASCII output). There are three cases which either I misunderstand or pcretest is not playing nicely.
> #1
> /abcd\t\n\r\f\a\e\371\x3b\$\\\?caxyz/
> ------------------------------------------------------------------
> 0 43 Bra
> 3 abcd\x05\x15\x0d\x0c\x2f\x279\x3b$\?caxyz
> 43 43 Ket
> 46 End
> ------------------------------------------------------------------
> Capturing subpattern count = 0
> Contains explicit CR or LF match
> No options
> First char = 'a'
> Need char = 'z'
> abcd\t\n\r\f\a\e9;\$\\?caxyz
> No match
> The pattern seems to be correct, yet no match. Suspect - pcretest
The pattern is correct, but the subject is not. The first part,
"abcd\t\n\r\f\a\e9" should match, but then ; will not match \x3b in
EBCDIC because \x3b is an ASCII semicolon. Try changing \x3b in the
pattern to \x5e.
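
A quick way to check code points like these is Python's EBCDIC codecs. This is only a sketch: it assumes cp037 as a stand-in for whatever EBCDIC code page the test machine uses (cp037 ships with Python; other code pages may differ for some characters). The variable names are made up for illustration.

```python
# Assumption: cp037 stands in for the EBCDIC code page on the test machine.
EBCDIC = "cp037"

semicolon = ";".encode(EBCDIC)[0]          # EBCDIC code of ';'
nine = "9".encode(EBCDIC)[0]               # EBCDIC code of '9'
ascii_3b = bytes([0x3B]).decode("ascii")   # what 0x3b means in ASCII

print("';' in EBCDIC is \\x%02x" % semicolon)
print("'9' in EBCDIC is \\x%02x, octal \\%o" % (nine, nine))
print("0x3b in ASCII is %r" % ascii_3b)
```

Under that assumption, the "9" in the subject lines up with \371 in the pattern, while the ";" in the subject is 0x5e and so cannot match the pattern's \x3b.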
> #2
> /(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)\12\123/
> ------------------------------------------------------------------
> 0 117 Bra
> 3 7 CBra 1
> 8 a
> 10 7 Ket
> 13 7 CBra 2
> <SNIP OUT>
> 100 7 Ket
> 103 7 CBra 11
> 108 k
> 110 7 Ket
> 113 \x0aë
> 117 117 Ket
> 120 End
> ------------------------------------------------------------------
> Capturing subpattern count = 11
> No options
> First char = 'a'
> Need char = 'ë'
> abcdefghijk\12S
> No match
> The pattern seems to be correct, yet no match. Suspect - pcretest
Same issue; \123 is \x53 which is ASCII "S".
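
The same kind of check can be sketched for this case, again assuming cp037 as the EBCDIC code page (an assumption, not something stated in the report). It looks up the EBCDIC code of "S" and also what byte \123 decodes to.

```python
# Assumption: cp037 stands in for the EBCDIC code page on the test machine.
EBCDIC = "cp037"

s_code = "S".encode(EBCDIC)[0]                # EBCDIC code of 'S'
s_octal = "\\%o" % s_code                     # the same code as an octal escape
octal_123 = bytes([0o123]).decode(EBCDIC)     # what \123 (0x53) is in EBCDIC

print("'S' in EBCDIC is \\x%02x, octal %s" % (s_code, s_octal))
print("\\123 in EBCDIC is %r" % octal_123)
```

With cp037, 0x53 decodes to "ë", which agrees with the "Need char" line in the output above, and "S" would need \342 (or \xe2) in the pattern instead of \123.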
> #3
> /\v*X\v?Y\v+Z\V*\x0a\V+\x0b\V{2,3}\x0c/
> ------------------------------------------------------------------
> 0 31 Bra
> 3 \v*+
> 5 X
> 7 \v?+
> 9 Y
> 11 \v++
> 13 Z
> 15 \V*
> 17 \x0a
> 19 \V++
> 21 \x0b
> 23 \V{2}
> 27 \V?+
> 29 \x0c
> 31 31 Ket
> 34 End
> ------------------------------------------------------------------
> Capturing subpattern count = 0
> No options
> No first char
> Need char = \x0c
> >XY\x15Z\x15A\x0bNN\x0c
This won't match. Up to "\x15Z" is OK, matching "\v*X\v?Y\v+Z", then
\V* will match nothing (because \x15 matches \v) and then \x15 will fail
to match \x0a.
> No match
> >\x15\x0dX\x15Y\x0a\x0bZZZ\x15AAA\x0bNNN\x0c
> No match
I can see you have turned the \x0a characters in the original subject
into \x15, but you didn't change the one in the pattern.
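
One way to avoid this kind of mismatch is to run the pattern and the subject through the same escape rewrite. The following is a minimal sketch, not part of PCRE or pcretest: the helper name and the mapping table are hypothetical, the table contains only the \x0a to \x15 change mentioned in this thread, and the ASCII subject is reconstructed by reversing that change.

```python
import re

# Hypothetical table: only the one substitution discussed in this thread.
# It is not a complete ASCII-to-EBCDIC mapping.
ESCAPE_MAP = {0x0A: 0x15}

def translate_hex_escapes(text, mapping=ESCAPE_MAP):
    """Rewrite \\xHH escapes whose value appears in mapping.

    Limitation: an escaped backslash followed by xHH is not treated specially.
    """
    def repl(match):
        value = int(match.group(1), 16)
        return "\\x%02x" % mapping.get(value, value)
    return re.sub(r"\\x([0-9A-Fa-f]{2})", repl, text)

# Pattern as quoted in the report; subject with \x15 turned back into \x0a.
pattern = r"/\v*X\v?Y\v+Z\V*\x0a\V+\x0b\V{2,3}\x0c/"
subject = r">XY\x0aZ\x0aA\x0bNN\x0c"

ebcdic_pattern = translate_hex_escapes(pattern)
ebcdic_subject = translate_hex_escapes(subject)

print(ebcdic_pattern)
print(ebcdic_subject)
```

Applying one function to both strings keeps the literal \x0a in the pattern in step with the newlines in the subject.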