Re: [pcre-dev] Another release candidate

Author: Sheri
To: pcre-dev
Subject: Re: [pcre-dev] Another release candidate
Hi Philip,

Philip Hazel wrote:
> I have fixed the bugs that Tavis reported. I have also done two things
> to try to make Sheri happy:
> (1) When CRLF is a valid line ending, if a pattern contains an explicit
> match for \r or \n (either as escapes, or as literals, or indirectly via
> things like [^b]) then the infamous "change 46" does not happen. That
> is, after a failed match at a CRLF point, the match point advances by
> only one character not two. If there are no explicit \r or \n
> references, the advance is by 2. This means that now, with ANYCRLF set:
>    \nA       matches \r\nA
>    [\r\n]A   matches \r\nA
>    (\r|\n)A  matches \r\nA 

> but
>    .+A       does not match \r\nA 

> I have updated the documentation, and noted that nevertheless there may
> still be anomalies.

I built it, ran your test suite and ours, and did not encounter any
anomalies. I suspect it would take some time to discover anything
out of line. I'd like to see my editor (NoteTab Pro) built with this RC
to see what happens, but the author gets his PCRE/Delphi API through a
third party. Nonetheless, I will send him an email to ask whether some
prerelease testing would be possible.
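
For anyone following along, here is how I read the advance rule in (1),
as a minimal sketch. This is not PCRE's actual code: the function name
and the pattern_has_explicit_crlf flag are made up for illustration, and
I am assuming the caller has already worked out whether the pattern
refers to \r or \n explicitly.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE: how far the start-of-match
   point moves after a failed attempt at subject[pos], when CRLF is a
   valid line ending (for example with ANYCRLF set). */
static size_t advance_after_failure(const char *subject, size_t len,
                                    size_t pos, int pattern_has_explicit_crlf)
{
    /* Are we sitting on the \r of a \r\n pair? */
    int at_crlf = pos + 1 < len &&
                  subject[pos] == '\r' && subject[pos + 1] == '\n';

    /* Skip the whole CRLF pair only when the pattern never refers to
       \r or \n explicitly; otherwise step one character as usual. */
    if (at_crlf && !pattern_has_explicit_crlf)
        return 2;
    return 1;
}
```

With the subject \r\nA and a failure at offset 0, a pattern like \nA
(explicit \n) would get an advance of 1, so the next attempt starts at
the \n and can match. A pattern like .+A (no explicit reference) would
get an advance of 2, stepping over the CRLF as a unit.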

> (2) I have added recognition of (*CR), (*LF), (*CRLF), (*ANYCRLF), and
> (*ANY), at the start of pattern only, to override the line ending
> convention. These strings *must* be at the start of the pattern, and in
> upper case. I've taken the syntax from Perl's new controls like
> (*PRUNE).

Excellent, and thank you.

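As a rough illustration of how I understand the rule in (2), here is a
small sketch of recognizing one of these settings at the very start of
a pattern. It is not taken from the PCRE source: the enum, the function
name, and the skip parameter are all hypothetical, and the real code may
be organized quite differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; PCRE's internals differ. */
enum newline_kind { NL_DEFAULT, NL_CR, NL_LF, NL_CRLF, NL_ANYCRLF, NL_ANY };

/* If the pattern begins with one of the upper-case settings, return the
   convention and store the setting's length in *skip. Otherwise return
   NL_DEFAULT and set *skip to 0. */
static enum newline_kind leading_newline_setting(const char *pattern,
                                                 size_t *skip)
{
    static const struct { const char *text; enum newline_kind kind; } table[] = {
        { "(*CR)",      NL_CR },
        { "(*LF)",      NL_LF },
        { "(*CRLF)",    NL_CRLF },
        { "(*ANYCRLF)", NL_ANYCRLF },
        { "(*ANY)",     NL_ANY },
    };
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        size_t n = strlen(table[i].text);
        /* strncmp is case-sensitive and compares from pattern[0], so
           only an upper-case setting at the very start is accepted. */
        if (strncmp(pattern, table[i].text, n) == 0) {
            *skip = n;
            return table[i].kind;
        }
    }
    *skip = 0;
    return NL_DEFAULT;
}
```

Under this sketch, "(*CRLF)a.b" yields NL_CRLF with 7 characters to
skip, while "(*cr)a" and "(?#x)(*CR)a" both fall through to NL_DEFAULT,
because the setting is either not in upper case or not at the start.
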
> Sheri: I thought about allowing (?#...) comments before but decided
> against it for two reasons: (1) more work, (2) since (*CR) etc are not
> really part of the pattern, I don't think they should be allowed within
> it.


> You are pushing me in a direction in which I don't really want to go. I
> still think the "right" way of doing all the options which apply to the
> whole pattern (as opposed to being changeable within parts of it - as
> (?J) is, incidentally) is for the application to have syntax that it
> then passes in the options bits. I know the POSIX interface doesn't have
> that - but then it is a weaker interface. I think I'm going to stick at
> "you should use the native interface if you want all the fancy
> facilities".

Fair enough.

I realize I'm speaking for PCRE's end users by proxy, while your direct
users are the developers and their applications. Even with the native
API, the application developer must explicitly support each external
option. So your features will get more mileage where you choose to make
them directly accessible to end users.