https://bugs.exim.org/show_bug.cgi?id=1894
Bug ID: 1894
Summary: In UTF8 Locale Russian Cyrillic [а-я] range contains
only 32 of 33 letters
Product: PCRE
Version: 8.32
Hardware: x86-64
OS: Linux
Status: NEW
Severity: bug
Priority: medium
Component: Code
Assignee: ph10@???
Reporter: ikonta@???
CC: pcre-dev@???
Originally found on EL7 (with pcre-8.32); the issue seems to be more widespread.
Modern Russian alphabet contains 33 letters.
The standard UTF-8 range covers the 32 most common letters, but misses one ('ё').
Standard:
U+0410 А
…
U+044F я
Exceptions:
U+0401 Ё
U+0451 ё
http://www.utf8-chartable.de/unicode-utf8-table.pl?start=1024
The [а-я] range should include the letter 'ё' (and [А-Я] should include 'Ё'), but actually it does not.
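The code point layout listed above can be checked with a short Python sketch. It assumes that a character class range is compared purely by Unicode code point; the helper name `in_range` is hypothetical and is not part of PCRE or PHP.

```python
# Hypothetical helper: evaluates a character class range [lo-hi] purely by
# Unicode code point. This is an assumption about how the range is compared.
def in_range(ch: str, lo: str, hi: str) -> bool:
    return ord(lo) <= ord(ch) <= ord(hi)

# Lowercase letters covered by the contiguous range U+0430..U+044F.
lower = [chr(cp) for cp in range(0x0430, 0x0450)]

print(len(lower))                # number of letters in а..я
print(hex(ord("ё")))             # code point of 'ё'
print(in_range("ё", "а", "я"))   # is 'ё' inside [а-я]?
print(in_range("Ё", "А", "Я"))   # is 'Ё' inside [А-Я]?
```

Under that assumption, 'ё' (U+0451) and 'Ё' (U+0401) lie outside the contiguous blocks, which is consistent with the behaviour reported here.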
Forwarded here from downstream tracker, see
https://bugs.php.net/bug.php?id=73251
$valid_string_expr = '/^[а-я]+$/u';
var_dump(preg_match($valid_string_expr, $str));
$str = "ещё";
var_dump(preg_match($valid_string_expr, $str));
The second match fails, although it should not.
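As a rough illustration, the same pattern can be tried in Python. This sketch uses Python's `re` module rather than PCRE or PHP's `preg_match`, so it only shows the analogous code-point behaviour, and the sample words are assumptions chosen for the example. Listing 'ё' explicitly in the class is one possible workaround, not a fix confirmed by the PCRE maintainers.

```python
import re

# Range as written in the report, and the same range with 'ё' added explicitly.
narrow = re.compile(r"^[а-я]+$")
wide = re.compile(r"^[а-яё]+$")

def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole string is accepted by the pattern."""
    return pattern.match(text) is not None

print(matches(narrow, "еще"))  # word without 'ё'
print(matches(narrow, "ещё"))  # word with 'ё', plain range
print(matches(wide, "ещё"))    # word with 'ё', extended class
```

For uppercase input the corresponding class would be [А-ЯЁ].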