[pcre-dev] [Bug 1616] Line begin anchor fits not at end of t…

Author: Philip Hazel
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1616] Line begin anchor fits not at end of text, if the last character is a new line character
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=1616




--- Comment #3 from Philip Hazel <ph10@???> 2015-04-17 11:10:36 ---
On Fri, 17 Apr 2015, David Gausmann wrote:

> But it is hard to find and the behaviour is not intuitive, but if this is the
> behaviour of PERL then PCRE should behave like that.


One person's intuition is another's nonsense. :-) I don't *know* why
Perl behaves as it does, but I can see the logic -- in multiline mode,
it treats the input as a sequence of terminated lines, so, for example,
"A\nB\nC\n" is three lines. ^ matches at the start of a line; hence
(only) three matches.

The other engines take the more character-based view: ^ matches at the
start of the subject or immediately after any line terminator, including
a final one; hence four matches.
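
To make the two views concrete, here is a minimal sketch in plain Python
that lists the offsets where ^ would match under each rule. It does not
call PCRE or Perl; the function names are hypothetical, and "\n" is
assumed to be the only line terminator.

```python
def caret_positions_line_view(text):
    """Line-based view (as described for Perl above): ^ matches at the
    start of the subject and after a newline, except when that newline
    is the last character of the subject."""
    return [
        i for i in range(len(text) + 1)
        if i == 0 or (text[i - 1] == "\n" and i < len(text))
    ]


def caret_positions_char_view(text):
    """Character-based view: ^ matches at the start of the subject and
    after every newline, including a final one."""
    return [
        i for i in range(len(text) + 1)
        if i == 0 or text[i - 1] == "\n"
    ]


subject = "A\nB\nC\n"
print(caret_positions_line_view(subject))  # [0, 2, 4]
print(caret_positions_char_view(subject))  # [0, 2, 4, 6]
```

The two rules differ only at the very end of the subject, and only when
the subject ends with a newline.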

> Could you maybe implement an option flag, which allows to use the non-PERL
> behaviour for the line begin anchor?
> Then PCRE could be used as compatible replacement for weak flavors like those
> from JavaScript/VBScript.


The 8.xx series of PCRE releases (now called PCRE1) is in "maintenance
only" mode. All development is now happening in the 10.xx series, known
as PCRE2, which has a completely revised API. It would be very easy to
add such an option to PCRE2 (a new option bit for pcre2_match() and a
little bit of code). I have made a note to do this.

Regards,
Philip

