http://bugs.exim.org/show_bug.cgi?id=1020
Summary: Wrong pcre_exec() with some UTF8 chars in the pattern
Product: PCRE
Version: 8.10
Platform: x86-64
OS/Version: Windows
Status: NEW
Severity: bug
Priority: medium
Component: Code
AssignedTo: ph10@???
ReportedBy: udmitry@???
CC: pcre-dev@???
Hi, Philip,
I get wrong pcre_exec() results in UTF-8 mode with PCRE_NEWLINE_ANY and
PCRE_EXTENDED.
The pattern is 6 bytes long (4 UTF-8 characters): one literal character
followed by a 3-character comment:
p = 'A#\xd1\x85\xd1\x86';
The subject line has 3 characters (plain ASCII):
TestLine := 'BAB';
Test code is (Pascal-based):
/************************/
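// Compile in extended, UTF-8 mode, accepting any Unicode newline sequence.
re := pcre_compile(P, PCRE_EXTENDED or PCRE_UTF8 or PCRE_NEWLINE_ANY,
                   @ePtr, @eo, nil);
if re <> nil then
  // Cnt receives the number of captured substrings, or a negative error code.
  Cnt := pcre_exec(re, nil, TestLine, Length(TestLine), 0, PCRE_NOTEMPTY,
                   @(Vector[0]), Length(Vector));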
/************************/
In this case Cnt == -1 (PCRE_ERROR_NOMATCH), although the literal 'A' should
match.
If I change the pattern character \xd1\x85 to another character, it works
fine. For example:
p = 'A#\xd1\x86\xd1\x86';
...
Cnt == 1 !!!
If I remove the PCRE_NEWLINE_ANY option from pcre_compile(), it works too.
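For reference, the two patterns can be compared byte by byte. The sketch
below is in Python rather than Pascal and makes no PCRE calls; the helper
names (describe, chars_containing_byte) are purely illustrative. It shows
that the only difference between the failing and the working pattern is the
byte 0x85, which has the same value as the NEL newline (U+0085), one of the
newlines that PCRE_NEWLINE_ANY recognises. That this coincidence is the cause
of the failure is an assumption; I have not checked it against the PCRE
source.

```python
# Byte-level comparison of the two patterns from this report.
# Assumption (not verified in PCRE itself): the byte 0x85 inside a multi-byte
# UTF-8 character may be mistaken for NEL (U+0085) while a comment is skipped.
NEL = 0x85  # code point value of U+0085 NEXT LINE


def describe(pattern):
    """Return (character, UTF-8 bytes) pairs for a UTF-8 encoded pattern."""
    return [(ch, ch.encode("utf-8")) for ch in pattern.decode("utf-8")]


def chars_containing_byte(pattern, value):
    """List the characters whose UTF-8 encoding contains the given byte."""
    return [ch for ch, raw in describe(pattern) if value in raw]


failing = b"A#\xd1\x85\xd1\x86"  # Cnt == -1 in the report
working = b"A#\xd1\x86\xd1\x86"  # Cnt == 1 in the report

for name, pat in (("failing", failing), ("working", working)):
    chars = describe(pat)
    print(name, "pattern:", len(pat), "bytes,", len(chars), "characters")
    for ch, raw in chars:
        # Print code points rather than glyphs to stay console-safe.
        print("  U+%04X  bytes: %s" % (ord(ch), " ".join("%02x" % b for b in raw)))
    hits = ["U+%04X" % ord(c) for c in chars_containing_byte(pat, NEL)]
    print("  characters containing byte 0x85:", hits)
```

In the failing pattern only U+0445 (bytes d1 85) contains 0x85; the working
pattern contains no such byte.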
Thanks,
Dmitry Ukolov.