https://bugs.exim.org/show_bug.cgi?id=2483
Petr Pisar <ppisar@???> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |INVALID
--- Comment #4 from Petr Pisar <ppisar@???> ---
The reproducer can be reduced to:
/[^a]*\x{3c2}/i,utf
\x{d10000}\=no_utf_check
It crashes because the subject text \x{d10000} is not valid UTF-8 and, at
the same time, you disable the UTF-8 validity check with the no_utf_check
subject modifier. If you remove the modifier:
/[^a]*\x{3c2}/i,utf
\x{d10000}
then PCRE performs the check and explains what's wrong with the subject text:
$ pcre2test < test
PCRE2 version 10.33 2019-04-16
/[^a]*\x{3c2}/i,utf
\x{d10000}
Failed: error -13: UTF-8 error: 5-byte character is not allowed (RFC 3629) at offset 0
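To make the error message concrete, here is a minimal Python sketch of why the subject is rejected. It assumes pcre2test encodes \x{d10000} with the old extended UTF-8 scheme (up to 6 bytes), which is what the "5-byte character" message suggests. The helper names encode_extended_utf8 and is_valid_utf8 are made up for illustration and are not part of PCRE2.

```python
def encode_extended_utf8(cp: int) -> bytes:
    """Encode a code point with the pre-RFC 3629 scheme (up to 6 bytes).

    Illustrative helper only; assumed to mirror what pcre2test does
    for \\x{...} escapes in UTF mode.
    """
    if cp < 0x80:
        return bytes([cp])
    # (exclusive upper bound, lead-byte prefix, continuation byte count)
    ranges = (
        (0x800, 0xC0, 1),
        (0x10000, 0xE0, 2),
        (0x200000, 0xF0, 3),
        (0x4000000, 0xF8, 4),   # 5-byte form, forbidden by RFC 3629
        (0x80000000, 0xFC, 5),  # 6-byte form, forbidden by RFC 3629
    )
    for limit, lead, n in ranges:
        if cp < limit:
            tail = [0x80 | ((cp >> (6 * i)) & 0x3F) for i in range(n - 1, -1, -1)]
            return bytes([lead | (cp >> (6 * n))] + tail)
    raise ValueError("code point out of range")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 (Python's strict decoder)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


subject = encode_extended_utf8(0xD10000)
print(subject.hex(" "))        # f8 b4 90 80 80
print(is_valid_utf8(subject))  # False
```

The value 0xd10000 is above U+10FFFF, so it cannot be expressed in the 1- to 4-byte forms that RFC 3629 allows, and the lead byte 0xf8 is never valid in UTF-8.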
This is not a bug. It's documented behavior. From the pcre2api(3) manual:
If you know that your pattern is a valid UTF string, and you want to skip
this check for performance reasons, you can set the PCRE2_NO_UTF_CHECK
option. When it is set, the effect of passing an invalid UTF string as a
pattern is undefined. It may cause your program to crash or loop.
Note that this option can also be passed to pcre2_match() and
pcre2_dfa_match(), to suppress UTF validity checking of the subject string.
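In practice this means that whoever disables the check takes over responsibility for validating the input. A rough Python sketch of that pattern follows. The unchecked_match parameter is a hypothetical stand-in for any matcher invoked with UTF checking disabled; it is not a real PCRE2 binding.

```python
from typing import Callable


def match_skipping_check(
    subject: bytes,
    unchecked_match: Callable[[bytes], object],
) -> object:
    """Validate the subject once, then call a matcher that skips its own check.

    unchecked_match is a hypothetical callable standing in for a match
    call made with UTF validity checking turned off.
    """
    try:
        subject.decode("utf-8")  # strict decode rejects malformed UTF-8
    except UnicodeDecodeError as exc:
        raise ValueError(f"subject is not valid UTF-8: {exc.reason}") from exc
    return unchecked_match(subject)


# Usage with a dummy matcher that just reports the subject length.
print(match_skipping_check("ς".encode("utf-8"), len))  # 2
```

If the subject comes from an untrusted source and has not been validated this way, the safer choice is to leave the library's own check enabled.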