[pcre-dev] [Bug 2062] New: matching Unicode class Pd seems t…

Author: admin
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 2062] New: matching Unicode class Pd seems to be hitting additional wrong characters (Po?)
https://bugs.exim.org/show_bug.cgi?id=2062

            Bug ID: 2062
           Summary: matching Unicode class Pd seems to be hitting
                    additional wrong characters (Po?)
           Product: PCRE
           Version: N/A
          Hardware: All
                OS: All
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
          Assignee: ph10@???
          Reporter: exim-bugtracker@???
                CC: pcre-dev@???


The Pd class (dashes and dash-like characters) should not match any of the
following characters (among others):
,:/.
It should contain only the characters listed here:
http://www.fileformat.info/info/unicode/category/Pd/list.htm
That list agrees with my reading of
http://www.unicode.org/Public/UNIDATA/UnicodeData.txt, where ',.:/' are all in
Po. Is the complete set of Po possibly being matched by Pd?
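
The category claim can be checked independently of PCRE. The following is a minimal sketch using Python's standard unicodedata module. Python is not part of the original report, and the names `general_category` and `reported` are chosen here purely for illustration.

```python
import unicodedata


def general_category(ch):
    """Return the two-letter Unicode general category of one character."""
    return unicodedata.category(ch)


# The characters the report says are wrongly matched by Pd.
reported = ",:/."

categories = {ch: general_category(ch) for ch in reported}
for ch, cat in categories.items():
    # Print the code point, the character, and its general category.
    print(f"U+{ord(ch):04X} {ch!r}: {cat}")
```

If the reading of UnicodeData.txt above is right, every line should show Po rather than Pd.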

Using PHP, the regex "/[\pPd]/u" matches the characters listed above.
I get the same result online, e.g. on https://regex101.com/
regex: [\pPd]
flags: gu
input: Alpha,Beta:Gamma/Delta.
matches: all the non-letters (, : / .)
expected: zero matches
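
To make the expected and observed results concrete, here is a small sketch that classifies the test input by general category without using any regex engine. It assumes Python's unicodedata module as a stand-in for the Unicode tables. The helper `chars_in_category` is hypothetical and does not reproduce PCRE's matching logic.

```python
import unicodedata


def chars_in_category(text, prefix):
    """Return the characters of text whose general category starts with prefix."""
    return [ch for ch in text if unicodedata.category(ch).startswith(prefix)]


sample = "Alpha,Beta:Gamma/Delta."

# Characters that are really in Pd: what the report expects to match.
expected_pd = chars_in_category(sample, "Pd")

# Characters in any punctuation category (P*): compare with the observed matches.
all_punct = chars_in_category(sample, "P")

print("Pd only:", expected_pd)
print("any P* :", all_punct)
```

The observed matches coincide with the second list, not the first. One possible reading, which the report does not confirm, is that the pattern is being treated as the whole P category. PCRE's documentation describes the brace-less form of \p as taking a single letter, so retesting with the braced form \p{Pd} may be worthwhile.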
