Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 2315] PCRE2_NEWLINE_ANYCRLF appears to be nonfunctional
https://bugs.exim.org/show_bug.cgi?id=2315

--- Comment #2 from Philip Hazel <ph10@???> ---
First let me say that the behaviour of PCRE2 should be the same as PCRE1 in
this area, the only difference being the API and how you set the options. Your
code has these two lines:

        const uint16_t          pattern[] = { 'a', '\r', 0 };
        const uint16_t          subject[] = { 'a', '\n' };


The pattern is explicitly looking for 'a' followed by '\r'. This doesn't
involve any newline interpretation at all. The meaning of PCRE2_NEWLINE_xxx is
"if looking for a newline, these are to be recognized". It does not mean "if
you find one of these in a pattern, use it to match any allowed newline".
Changing the lines to

        const uint16_t          pattern[] = { 'a', '$', 0 };    
        const uint16_t          subject[] = { 'a', '\r', 'b' };  


causes the match to succeed, giving a return value of 1 (note that >= 0 indicates
success, not just == 0). (I had to tweak your code to get it to compile with
gcc, changing nil to NULL, UniChar to uint16_t, and removing the call to
verify(), but I don't think those were relevant to your issue.) Also, removing
the call to pcre2_set_newline() causes the match to fail.
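
To make the distinction concrete, here is a toy model in plain C. It is not how PCRE2 is implemented, and every function name in it is hypothetical; it only illustrates that a literal code unit in a pattern is compared for equality, whereas a newline-aware assertion such as a multiline '$' consults the set of recognized newlines (here assumed to be ANYCRLF: CR, LF, or CRLF).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these helpers are hypothetical and are NOT PCRE2 code. */

/* A literal code unit in a pattern matches only that exact code unit.
   The newline setting plays no part in this comparison. */
static bool literal_matches(uint16_t pattern_unit, uint16_t subject_unit)
{
    return pattern_unit == subject_unit;
}

/* Under an ANYCRLF-style convention, return the length in code units of the
   newline starting at subject[pos] (1 for CR or LF, 2 for CRLF), or 0 if
   there is no newline at that position. */
static size_t anycrlf_newline_length(const uint16_t *subject, size_t length,
                                     size_t pos)
{
    assert(subject != NULL);
    if (pos >= length)
        return 0;
    if (subject[pos] == '\r')
        return (pos + 1 < length && subject[pos + 1] == '\n') ? 2 : 1;
    if (subject[pos] == '\n')
        return 1;
    return 0;
}

/* A multiline '$' is an assertion: it holds at the very end of the subject
   or immediately before a recognized newline. It consumes nothing. */
static bool dollar_holds_multiline(const uint16_t *subject, size_t length,
                                   size_t pos)
{
    assert(subject != NULL);
    return pos == length || anycrlf_newline_length(subject, length, pos) > 0;
}
```

In this model, literal_matches('\r', '\n') is false, which corresponds to your original pattern failing, while dollar_holds_multiline() is true at offset 1 of { 'a', '\r', 'b' }, which corresponds to the modified pattern succeeding.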

I added the extra 'b' so that the '\r' was not at the end of the string, which
checks that MULTILINE is working, but without the 'b' it still works.

So, there must be something deeper going on here. Can you give an example that
worked with PCRE1 but does not with PCRE2? Or indeed the code you used with
PCRE1?
