https://bugs.exim.org/show_bug.cgi?id=1670
Bug ID: 1670
Summary: --color produces invalid UTF-8 for property matches
Product: PCRE
Version: 10.10 (PCRE2)
Hardware: x86-64
OS: Linux
Status: NEW
Severity: bug
Priority: medium
Component: Code
Assignee: ph10@???
Reporter: eximbugs.apriori@???
CC: pcre-dev@???
I've noticed that when combining the --color option with matches against
Unicode properties, the output is sometimes garbled: the terminal shows a red
replacement character followed by one or more normal replacement characters.
I can reproduce this with accented characters on both PCRE and PCRE2.
Precomposed à:
echo 'à'|pcre2grep --color=always '\p{L}'|xxd
00000000: 1b5b 313b 3331 6dc3 1b5b 3030 6da0 0a .[1;31m..[00m..
a with combining diacritic:
echo 'à'|pcre2grep --color=always '\p{L}'|xxd
00000000: 1b5b 313b 3331 6d61 1b5b 3030 6d1b 5b31 .[1;31ma.[00m.[1
00000010: 3b33 316d cc1b 5b30 306d 800a ;31m..[00m..
Circled letter a:
echo 'ⓐ'|pcre2grep --color=always '\p{L}'|xxd
00000000: 1b5b 313b 3331 6de2 1b5b 3030 6d93 900a .[1;31m..[00m...
Fraction ½:
echo '½'|pcre2grep --color=always '\p{N}'|xxd
00000000: c21b 5b31 3b33 316d bd1b 5b30 306d 0a ..[1;31m..[00m.
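The dumps above can be checked mechanically. The following is a minimal Python sketch, not part of the original report, that takes the bytes of the precomposed à and ½ dumps, confirms that the coloured output is not valid UTF-8 while the same bytes with the colour sequences stripped are, and prints the bytes that ended up highlighted. The names A_GRAVE, ONE_HALF, SGR, is_valid_utf8 and highlighted_bytes are made up for this illustration. The Latin-1 category lookup is only an assumption about what might be going on: the highlighted bytes happen to fall in the requested property class when read as single Latin-1 characters, which would fit byte-wise matching, but nothing in this report confirms that.

```python
import re
import unicodedata

# Bytes copied from the xxd dumps in this report (precomposed a-grave and one-half).
A_GRAVE = bytes.fromhex("1b5b313b33316dc31b5b30306da00a")
ONE_HALF = bytes.fromhex("c21b5b313b33316dbd1b5b30306d0a")

# ANSI SGR colour sequences of the form ESC [ ... m.
SGR = re.compile(rb"\x1b\[[0-9;]*m")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def highlighted_bytes(data: bytes) -> bytes:
    """Return the bytes found between colour-on (1;31) and colour-off (00)."""
    return b"".join(re.findall(rb"\x1b\[1;31m(.*?)\x1b\[00m", data, re.S))


for name, dump in (("a-grave", A_GRAVE), ("one-half", ONE_HALF)):
    plain = SGR.sub(b"", dump)  # the input line as it was before colouring
    code_points = [f"U+{ord(c):04X}" for c in plain.decode("utf-8").strip()]
    print(name, "coloured output is valid UTF-8:", is_valid_utf8(dump))
    print(name, "code points without escapes:", code_points)
    for byte in highlighted_bytes(dump):
        # Assumption for illustration only: read the byte as a Latin-1 character.
        category = unicodedata.category(chr(byte))
        print(name, f"highlighted byte 0x{byte:02x}, Latin-1 category {category}")
```

In both cases a colour sequence sits between the lead byte and the continuation byte of a two-byte UTF-8 sequence, which is why a terminal renders replacement characters.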