https://bugs.exim.org/show_bug.cgi?id=2062
Bug ID: 2062
Summary: matching Unicode class Pd seems to be hitting
additional wrong characters (Po?)
Product: PCRE
Version: N/A
Hardware: All
OS: All
Status: NEW
Severity: bug
Priority: medium
Component: Code
Assignee: ph10@???
Reporter: exim-bugtracker@???
CC: pcre-dev@???
The Pd class (dashes and dash-like characters) should not match any of the
following characters, among others:
,:/.
The Pd class should contain only the characters listed here:
http://www.fileformat.info/info/unicode/category/Pd/list.htm
(This agrees with my reading of
http://www.unicode.org/Public/UNIDATA/UnicodeData.txt, where ',', '.', ':' and
'/' are all in Po. Is the complete set of Po possibly being matched by Pd?)
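
The category claim above can be checked independently of PCRE. A minimal
sketch in Python, assuming the standard unicodedata module is an acceptable
stand-in for UnicodeData.txt (its Unicode data version depends on the Python
build); the helper name categories is only illustrative:

```python
import unicodedata

def categories(chars):
    """Map each character to its Unicode general category (e.g. 'Po', 'Pd')."""
    return {ch: unicodedata.category(ch) for ch in chars}

# The four characters from the report, plus HYPHEN-MINUS for comparison.
reported = categories(",:/.")
dash = categories("-")

for ch, cat in {**reported, **dash}.items():
    print(f"U+{ord(ch):04X} {ch!r}: {cat}")
```

All four reported characters come back as Po, and only the hyphen-minus is Pd.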
Using PHP, the regex "/[\pPd]/u" matches every character in the list above.
I get the same result online, e.g. on
https://regex101.com/
regex: [\pPd]
flags: gu
input: Alpha,Beta:Gamma/Delta.
matches: all the non-letters (',', ':', '/', '.')
expected: zero matches
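
The expected result for this input can be reproduced without a regex engine.
A sketch in Python, again assuming unicodedata as the reference;
chars_in_category is a hypothetical helper, not part of PCRE or PHP:

```python
import unicodedata

def chars_in_category(text, prefix):
    """Return the characters of text whose general category starts with prefix."""
    return [ch for ch in text if unicodedata.category(ch).startswith(prefix)]

sample = "Alpha,Beta:Gamma/Delta."

pd_hits = chars_in_category(sample, "Pd")  # what a Pd-only class should find
po_hits = chars_in_category(sample, "Po")  # the category suspected in the report

print(len(pd_hits), po_hits)
```

On this sample the Pd list is empty, and the Po list is exactly the four
non-letters, which is consistent with the expected and observed results above.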