Author: Graycode Date: To: Pcre-dev Subject: Re: [pcre-dev] SUB symbol inside MARK verb
On Sat, 13 Apr 2013, ND wrote:
> If pattern contains a SUB symbol (ASCII code 26, hex 1A) inside MARK verb then
> end-of-file error is detected.
>
>
> /(*:a?b)/
> a
>
> PCRE version 8.33-RC1 2012-12-07
> /(*:a** Unexpected EOF
>
>
>
> Is this a planned behaviour?
It is normal behaviour in Windows when a file is opened with 'r' (read)
but without 'b' (binary) mode. That is how pcretest opens its input: it
reads and processes the input file in text mode.
Windows signals an EOF condition because, in text mode, it interprets
the characters being read from the file, and it treats SUB (Ctrl-Z,
hex 1A) as an end-of-file marker. The same EOF could happen outside of
expressions, for example when raw binary characters appear in the data
to be tested.
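A minimal C sketch of the text-mode versus binary-mode difference
described above. The helper names write_bytes and count_bytes and the
file name in the usage note are made up for illustration; this is not
code from pcretest.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: write len raw bytes to path in binary mode.
   Returns 0 on success, -1 on failure. */
static int write_bytes(const char *path, const unsigned char *data, size_t len)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t written = fwrite(data, 1, len, f);
    int close_rc = fclose(f);
    return (written == len && close_rc == 0) ? 0 : -1;
}

/* Hypothetical helper: count how many bytes fgetc() delivers before it
   reports EOF when path is opened with the given mode ("r" or "rb").
   Returns -1 if the file cannot be opened. */
static long count_bytes(const char *path, const char *mode)
{
    FILE *f = fopen(path, mode);
    if (f == NULL)
        return -1;
    long n = 0;
    while (fgetc(f) != EOF)
        n++;
    fclose(f);
    return n;
}
```

If the nine bytes of /(*:a<SUB>b)/ are written to a scratch file such
as "sub_demo.tmp", count_bytes(path, "rb") should report all nine on
any platform. With mode "r" the Windows C runtime is expected to stop
at the SUB byte and report fewer, while on POSIX systems the two modes
behave identically.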
This is not a problem with PCRE, per se; it's just a side effect of
the way the pcretest program works in Windows and perhaps other
environments.
Specify your expression as /(*:a\x1Ab)/ and pcretest will then be
able to properly interpret the request.
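For test files that are generated by a program, the same rewrite can be
automated. The following is a rough C sketch, assuming a hypothetical
helper named escape_pattern (not part of PCRE or pcretest) that copies
printable ASCII unchanged and turns every other byte into a \xHH
escape. It does not treat backslashes or delimiters specially.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: copy len bytes from in to out, replacing every
   byte outside printable ASCII (0x20..0x7E) with a \xHH escape.
   out is NUL-terminated. Returns 0 on success, -1 if out is too small. */
static int escape_pattern(const unsigned char *in, size_t len,
                          char *out, size_t outsz)
{
    size_t o = 0;
    for (size_t i = 0; i < len; i++) {
        int printable = (in[i] >= 0x20 && in[i] <= 0x7E);
        size_t need = printable ? 1 : 4;   /* "\xHH" is four characters */
        if (o + need + 1 > outsz)          /* keep room for the NUL */
            return -1;
        if (printable) {
            out[o++] = (char)in[i];
        } else {
            snprintf(out + o, outsz - o, "\\x%02X", (unsigned)in[i]);
            o += 4;
        }
    }
    if (o >= outsz)                        /* no room for the NUL */
        return -1;
    out[o] = '\0';
    return 0;
}
```

Given the three bytes 'a', 0x1A, 'b', this produces the text a\x1Ab,
which survives text-mode reading because it contains only printable
characters.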
It's similar to why you shouldn't specify your expression with a raw
embedded line feed instead of '\x0A' if a line feed is what the pattern
should match. In that case pcretest might also be given a carriage
return, though you intended none to be there.