https://bugs.exim.org/show_bug.cgi?id=1894
Petr Pisar <ppisar@???> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |ppisar@???
--- Comment #1 from Petr Pisar <ppisar@???> ---
The [а-я] range does not mean all Cyrillic symbols. It means the Unicode
characters in the range U+0430 to U+044F. And as you correctly noted, ё is
outside that range (U+0451). Therefore it should not match:
$ printf '/[а-я]*/8\nещё\n' | pcretest
PCRE version 8.39 2016-06-14
re> data> 0: \x{435}\x{449}
data>
If you extend the range up to ё, it will match:
$ printf '/[а-ё]*/8\nещё\n' | pcretest
PCRE version 8.39 2016-06-14
re> data> 0: \x{435}\x{449}\x{451}
data>
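
The same code point arithmetic can be checked outside pcretest. The following
is only a minimal sketch in Python (not part of the original report), assuming
the standard re module, whose character-class ranges are likewise defined by
code point order:

```python
import re

word = "ещё"  # U+0435, U+0449, U+0451

# Code points of the two range endpoints and of the letter outside the range.
print([f"U+{ord(c):04X}" for c in "аяё"])

narrow = re.compile(r"[а-я]*")  # covers U+0430..U+044F only
wide = re.compile(r"[а-ё]*")    # covers U+0430..U+0451

# match() anchors at the start and stops at the first character
# that is not in the class.
narrow_match = narrow.match(word).group()
wide_match = wide.match(word).group()
print(repr(narrow_match))
print(repr(wide_match))
```

With the narrow range the match stops before ё, as in the first pcretest run;
with the wider range the whole word is consumed.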
But there is a better way: instead of a Unicode range you can use a Unicode
script name. This is because Unicode ranges sometimes contain characters from
foreign scripts or unassigned code points:
$ printf '/\p{Cyrillic}*/8\nещё\n' | pcretest
PCRE version 8.39 2016-06-14
re> data> 0: \x{435}\x{449}\x{451}
data>
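
For comparison, Python's standard re module has no \p{...} support, so a
script test has to be approximated there. The sketch below is an assumption
for illustration only: looks_cyrillic is a hypothetical helper that checks
Unicode character names, which is a rough stand-in for the real Script
property used by PCRE, not an equivalent of it:

```python
import unicodedata


def looks_cyrillic(text):
    """Hypothetical helper: True if every character's Unicode name
    starts with 'CYRILLIC'. This approximates, but is not the same as,
    the Unicode Script property behind \\p{Cyrillic}."""
    return all(
        unicodedata.name(ch, "").startswith("CYRILLIC") for ch in text
    )


print(looks_cyrillic("ещё"))  # all three letters are Cyrillic
print(looks_cyrillic("eщё"))  # first letter is Latin 'e' (U+0065)
```

Unlike a hand-written range, this does not depend on where a letter happens
to sit in the code charts, which is the point of preferring the script name.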