[pcre-dev] Another release candidate

Author: Philip Hazel
Date:  
To: pcre-dev
CC: Tavis Ormandy
Subject: [pcre-dev] Another release candidate
ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/Testing/pcre-7.3-RC8.tar.gz
ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/Testing/pcre-7.3-RC8.tar.bz2
ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/Testing/pcre-7.3-RC8.zip

I have fixed the bugs that Tavis reported. I have also done two things
to try to make Sheri happy:

(1) When CRLF is a valid line ending, if a pattern contains an explicit
match for \r or \n (as an escape, as a literal, or indirectly via
something like [^b]), then the infamous "change 46" does not happen.
That is, after a failed match at a CRLF point, the match point advances
by only one character, not two. If there are no explicit \r or \n
references, the advance is by two characters. This means that now, with
ANYCRLF set:

   \nA       matches \r\nA
   [\r\n]A   matches \r\nA
   (\r|\n)A  matches \r\nA

but

   .+A       does not match \r\nA

I have updated the documentation, and noted that nevertheless there may
still be anomalies.
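
As an illustration only, the advance rule in (1) can be modelled as a
toy search loop. This is a sketch in Python, not PCRE's code: the
function name search_start and the explicit_crlf flag are hypothetical,
Python's re module stands in for the matcher, and the caller has to say
whether the pattern refers to \r or \n explicitly (PCRE works that out
itself at compile time).

```python
import re

def search_start(pattern, subject, explicit_crlf):
    """Toy model of an unanchored search; returns the match start or None.

    explicit_crlf is a hypothetical flag: True if the pattern contains an
    explicit reference to CR or LF, as described in item (1).
    """
    compiled = re.compile(pattern)
    pos = 0
    while pos <= len(subject):
        if compiled.match(subject, pos):
            return pos
        # After a failed match at a CRLF, skip both characters unless the
        # pattern mentions CR or LF explicitly.
        if subject.startswith("\r\n", pos) and not explicit_crlf:
            pos += 2
        else:
            pos += 1
    return None

subject = "\r\nA"
print(search_start(r"\nA", subject, True))       # advance by one
print(search_start(r"\nA", subject, False))      # advance by two
# [^\r\n] stands in for "." under ANYCRLF, so the flag is left False here.
print(search_start(r"[^\r\n]+A", subject, False))
```

With the flag set, \nA is found at offset 1; with the flag clear, the
loop steps over the whole CRLF and the same pattern is never tried
there.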

(2) I have added recognition of (*CR), (*LF), (*CRLF), (*ANYCRLF), and
(*ANY) to override the line ending convention. These strings *must* be
at the very start of the pattern, and in upper case. I've taken the
syntax from Perl's new controls like (*PRUNE).
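
The "start of pattern, upper case only" rule in (2) can be sketched as a
small prefix check. Again this is a Python illustration under stated
assumptions, not the PCRE implementation: the names NEWLINE_PREFIXES and
split_newline_override are hypothetical, and the sketch recognises at
most one such string, since nothing above says whether several may be
given.

```python
# Hypothetical table mapping each leading string to a convention name.
NEWLINE_PREFIXES = {
    "(*CR)": "CR",
    "(*LF)": "LF",
    "(*CRLF)": "CRLF",
    "(*ANYCRLF)": "ANYCRLF",
    "(*ANY)": "ANY",
}

def split_newline_override(pattern, default):
    """Return (convention, rest_of_pattern).

    The override is honoured only if it is the very first thing in the
    pattern and is written in upper case; otherwise the default
    convention is kept and the pattern is returned unchanged.
    """
    for prefix, convention in NEWLINE_PREFIXES.items():
        if pattern.startswith(prefix):
            return convention, pattern[len(prefix):]
    return default, pattern

print(split_newline_override("(*CRLF)a.b", "LF"))
print(split_newline_override("(*crlf)a.b", "LF"))
print(split_newline_override("(?#note)(*CR)a.b", "LF"))
```

Only the first call changes the convention. The lower-case form and the
form preceded by a comment are left alone, which mirrors the
restrictions described above.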

Sheri: I thought about allowing (?#...) comments before them, but
decided against it for two reasons: (1) more work, (2) since (*CR) etc.
are not really part of the pattern, I don't think they should be allowed
within it.

Philip

--
Philip Hazel, University of Cambridge Computing Service.