Author: ph10
Date:
To: Ze'ev Atlas
CC: pcre-dev@exim.org
Subject: Re: [pcre-dev] issues with EBCDIC and pcretest
On Thu, 18 Jun 2015, Ze'ev Atlas wrote:
I have added \x41 to the list that is recognized by \h and committed the
patch.
> An interesting point: The perlre document in perldoc (5.20) states: (The following all specify the same class of three characters: [-az] , [az-] , and [a\-z] . All are different from [a-z] , which specifies a class containing twenty-six characters, even on EBCDIC-based character sets.)
>
> Apparently, Perl somehow recognizes [a-z], treats it as a special case in EBCDIC, and ignores the non-letter gaps. This is news to me. Did you know that? I intend to ask in the perl-mvs forum what they do about it.
I did not know that. PCRE does not treat [a-z] as special.
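To make the "gaps" concrete, here is a rough Python sketch (not PCRE code) that counts what a purely numeric range from 'a' to 'z' covers. It assumes code page 037 as the EBCDIC variant, using Python's built-in cp037 codec; other EBCDIC code pages may differ in the non-letter positions.

```python
import string

# Assumption: code page 037 stands in for "EBCDIC" here.
CODEC = "cp037"

lo = "a".encode(CODEC)[0]  # code point of 'a'
hi = "z".encode(CODEC)[0]  # code point of 'z'

# Every code point that a purely numeric range [a-z] would cover.
covered = bytes(range(lo, hi + 1)).decode(CODEC)

letters = [c for c in covered if c in string.ascii_lowercase]
gaps = [c for c in covered if c not in string.ascii_lowercase]

print(f"range 0x{lo:02X}-0x{hi:02X} covers {len(covered)} code points")
print(f"{len(letters)} are a-z, {len(gaps)} are not")
```

Under that assumption the range is wider than the alphabet, so a matcher that treats [a-z] as a plain code point range also accepts the characters that sit between i and j and between r and s.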
> Obviously, I know that \p and \P are useless, but the tests are odd, and I am trying to reduce the level of oddity as much as I can.
There was a bug: no error was being diagnosed for \p and \P within a
class when UCP support was disabled. I have fixed that.
> While 0x41 is indeed not in any class that I may have thought about,
> 0x25, is actually in some.
> /[\h]/BZ
------------------------------------------------------------------
Bra
[\x05\x0b-\x0d\x15\x25 ]
Ket
End
------------------------------------------------------------------
That is wrong! It should only be \x05, space, and (now) \x41. Those
vertical spaces should not be there. Can you check again, please?
> /[\v]/BZ
------------------------------------------------------------------
Bra
[\x0b-\x0d\x15\x25]
Ket
End
------------------------------------------------------------------
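As a quick cross-check of the two outputs above, the following Python sketch compares the reported [\h] class with the set the reply says it should contain. The byte values are copied from the quoted pcretest output; the trailing space in the [\h] class is taken to be the EBCDIC space, 0x40, which is an assumption about how pcretest printed it.

```python
# Byte values copied from the pcretest output quoted in this message.
# Assumption: the literal space in the [\h] class is EBCDIC space, 0x40.
reported_h = {0x05, 0x0B, 0x0C, 0x0D, 0x15, 0x25, 0x40}
reported_v = {0x0B, 0x0C, 0x0D, 0x15, 0x25}

# Expected \h set as stated in the reply: \x05, space, and (now) \x41.
expected_h = {0x05, 0x40, 0x41}

extra_in_h = reported_h - expected_h      # should not be in \h at all
missing_from_h = expected_h - reported_h  # not yet in the reported output


def fmt(values):
    """Render a set of byte values as sorted hex strings."""
    return [f"0x{b:02X}" for b in sorted(values)]


print("unexpected in \\h:", fmt(extra_in_h))
print("missing from \\h:", fmt(missing_from_h))
print("extras are exactly the \\v set:", extra_in_h == reported_v)
```

The unexpected entries in the reported [\h] class are exactly the five values in the reported [\v] class, which is consistent with the vertical spaces having leaked into \h.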