https://bugs.exim.org/show_bug.cgi?id=2315
Bug ID: 2315
Summary: PCRE2_NEWLINE_ANYCRLF appears to be nonfunctional
Product: PCRE
Version: 10.32 (PCRE2)
Hardware: x86-64
OS: MacOS X
Status: NEW
Severity: bug
Priority: medium
Component: Code
Assignee: ph10@???
Reporter: siegel@???
CC: pcre-dev@???
Created attachment 1110
  --> https://bugs.exim.org/attachment.cgi?id=1110&action=edit
test code demonstrating failing match with
I'm upgrading a large commercial code base (BBEdit) to use PCRE2.
Because this product has been around for a very long time, we've had to make
some accommodations for the needs of legacy customers, in order to avoid
breaking their existing regular expression workflows.
In particular, BBEdit allows the use of "\r" to match a newline in the
document, as a synonym for "\n".
When using PCRE 8.x, I was able to make this work by including
PCRE_NEWLINE_ANYCRLF in the options that I passed to pcre16_compile2(). The
base set of options was PCRE_UCP | PCRE_MULTILINE | PCRE_NEWLINE_ANYCRLF |
PCRE_AUTO_CALLOUT, to which I would add PCRE_CASELESS and PCRE_ANCHORED as
appropriate.
When using PCRE2 (10.32, r1002), I'm finding that patterns including "\r" no
longer match line breaks as expected.
At first I was passing PCRE2_NEWLINE_ANYCRLF in the options argument to
pcre2_compile(), which is clearly not correct, so I assumed that was my bug. I
then adjusted my code to create a compile context and call pcre2_set_newline()
on it with PCRE2_NEWLINE_ANYCRLF, but that did not solve the issue.
(In case it matters: I'm building my application with "#define
PCRE2_CODE_UNIT_WIDTH 16", because my text document backing store is UTF-16.)
I have attached a bit of code which illustrates the issue; it should be
compilable as is - I extracted it from a test harness in my application, but
haven't tried to run it in isolation yet.
I'd appreciate any advice or corrective guidance that you can provide. :-)