[pcre-dev] [Bug 1670] New: --color produces invalid UTF-8 fo…

Author: admin
Date:
To: pcre-dev
Subject: [pcre-dev] [Bug 1670] New: --color produces invalid UTF-8 for property matches
https://bugs.exim.org/show_bug.cgi?id=1670

            Bug ID: 1670
           Summary: --color produces invalid UTF-8 for property matches
           Product: PCRE
           Version: 10.10 (PCRE2)
          Hardware: x86-64
                OS: Linux
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
          Assignee: ph10@???
          Reporter: eximbugs.apriori@???
                CC: pcre-dev@???


I’ve noticed that when the --color option is combined with a pattern that
matches on a Unicode property, the output is sometimes garbled: the terminal
shows a red replacement character followed by one or more normal replacement
characters.

I can reproduce this with accented characters on both PCRE and PCRE2.

Precomposed à:
echo 'à'|pcre2grep --color=always '\p{L}'|xxd
00000000: 1b5b 313b 3331 6dc3 1b5b 3030 6da0 0a    .[1;31m..[00m..
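
The dump above can be checked mechanically. The following is a minimal Python sketch, not part of the original report: it takes the bytes exactly as shown by xxd, tests whether they decode as UTF-8, and then tests again after removing the colour escape sequences. The helper name `is_valid_utf8` and the `SGR` regular expression are illustrative choices, and the regex assumes the only escape sequences present are the `ESC [ ... m` colour codes visible in the dump.

```python
import re

# Bytes copied verbatim from the xxd dump for the precomposed à (UTF-8: c3 a0).
raw = bytes.fromhex("1b5b313b33316dc31b5b30306da00a")

# Colour (SGR) escape sequences of the form ESC [ digits/semicolons m.
SGR = re.compile(rb"\x1b\[[0-9;]*m")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The lead byte c3 is followed by ESC instead of a continuation byte.
print(is_valid_utf8(raw))               # False

# With the colour codes removed, the original two-byte sequence is intact.
stripped = SGR.sub(b"", raw)
print(is_valid_utf8(stripped))          # True
print(repr(stripped.decode("utf-8")))   # 'à\n'
```

So the input bytes themselves survive; what breaks the encoding is that the colour-off sequence is written between c3 and a0.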


a with combining diacritic:
echo 'à'|pcre2grep --color=always '\p{L}'|xxd
00000000: 1b5b 313b 3331 6d61 1b5b 3030 6d1b 5b31  .[1;31ma.[00m.[1
00000010: 3b33 316d cc1b 5b30 306d 800a            ;31m..[00m..


Circled letter a:
echo 'ⓐ'|pcre2grep --color=always '\p{L}'|xxd
00000000: 1b5b 313b 3331 6de2 1b5b 3030 6d93 900a  .[1;31m..[00m...

Fraction ½:
echo '½'|pcre2grep --color=always '\p{N}'|xxd
00000000: c21b 5b31 3b33 316d bd1b 5b30 306d 0a    ..[1;31m..[00m.
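
In all four dumps the highlighted span is exactly one byte long. One reading that is consistent with this (an assumption, not something stated in the report and not a confirmed diagnosis) is that the property test was applied to each byte as if it were a code point in the range U+0000 to U+00FF. The Python sketch below checks that reading against the data. It uses Python's `unicodedata` as a stand-in for PCRE2's property tables, and the function name `bytewise_matches` is hypothetical.

```python
import unicodedata

# Payload bytes from the four dumps (colour codes and trailing newline removed),
# paired with the property letter used in each command: \p{L} or \p{N}.
samples = {
    "precomposed a-grave": (bytes.fromhex("c3a0"), "L"),
    "a + combining grave": (bytes.fromhex("61cc80"), "L"),
    "circled a": (bytes.fromhex("e29390"), "L"),
    "one half": (bytes.fromhex("c2bd"), "N"),
}


def bytewise_matches(data: bytes, prop: str) -> list:
    """For each byte, report whether the code point with that value
    has a general category starting with prop (e.g. 'L' or 'N')."""
    return [unicodedata.category(chr(b)).startswith(prop) for b in data]


for name, (data, prop) in samples.items():
    flags = bytewise_matches(data, prop)
    # Mark bytes that would match with a trailing '*'.
    marked = [f"{b:02x}{'*' if hit else ''}" for b, hit in zip(data, flags)]
    print(f"{name}: {' '.join(marked)}")
```

Under this assumption the bytes marked with `*` are c3, then 61 and cc, then e2, then bd, which are the same bytes that appear between the colour-on and colour-off sequences in the dumps, including the last case where c2 is printed before the colour starts.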

