Author: ph10
Date:
To: Ze'ev Atlas
CC: pcre-dev@exim.org
Subject: Re: [pcre-dev] issues with EBCDIC and pcretest

On Wed, 17 Jun 2015, Ze'ev Atlas wrote:
> Hi Philip,
> This might not be important, but I would like to ask. Please review the differences in behavior below and see if there is any cause for alarm. Granted, these are in testinput11 and by definition are not interesting, but....
> On testoutput11-8:
> /[\p{L}]/BM
> Memory allocation (code space): 15
> ------------------------------------------------------------------
>   0  11 Bra
>   3     [\p{L}]
>  11  11 Ket
>  14     End
> ------------------------------------------------------------------
> On the EBCDIC version:
> /[\p{L}]/BM
> Memory allocation (code space): 40
> ------------------------------------------------------------------
>   0  36 Bra
>   3     [p{}L]
>  36  36 Ket
>  39     End
> ------------------------------------------------------------------

That is not right, and I suspect all the others are the same. The \p and
\P escapes test Unicode properties (see the note at the top of test 11).
They should not work in an environment where Unicode properties are not
supported. You should get a compile-time error (unless you have
accidentally set SUPPORT_UCP in config.h). It seems to have treated \p
as a literal "p", which is correct if SUPPORT_UCP is not set and the
PCRE_EXTRA option is set (which it doesn't seem to be, so that's a bit
odd).
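
A rough check of the literal-"p" reading, as a sketch only and not PCRE code: if the class really holds the four literal characters p, {, L and }, and if the class members are listed in ascending code-point order (an assumption about how the class is printed), then EBCDIC code points give exactly the order seen in the EBCDIC output. The code points below are assumed from EBCDIC code page 1047; the thread does not say which code page is in use, and the helper name is hypothetical.

```python
# Assumed EBCDIC (code page 1047) code points for the four characters involved.
# The thread does not name the code page, so treat these values as an assumption.
EBCDIC = {"p": 0x97, "{": 0xC0, "}": 0xD0, "L": 0xD3}


def class_listing(chars, codepoint):
    """Return the distinct characters sorted by ascending code point.

    Hypothetical helper: it mimics a class printed by walking code points
    from low to high, which is an assumption about the printing order.
    """
    return "".join(sorted(set(chars), key=codepoint))


# What [\p{L}] contains if \p is taken as a literal "p".
literal_members = "p{L}"

ebcdic_order = class_listing(literal_members, EBCDIC.__getitem__)
ascii_order = class_listing(literal_members, ord)

print("EBCDIC order:", ebcdic_order)
print("ASCII order: ", ascii_order)
```

Under these assumptions the EBCDIC ordering is p{}L, which matches the class shown in the EBCDIC listing, whereas an ASCII build holding the same four literals would list them as Lp{}.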
Basically, I would not expect \p and \P to be useful in EBCDIC
environments. What happens if you use them outside a class? E.G.