[pcre-dev] [Bug 633] New: pcre_get_substring*: Problems with…

Author: Olaf Walkowiak
Date:  
To: pcre-dev
New subjects: [pcre-dev] [Bug 633] pcre_get_substring*: Problems with UTF8
Subject: [pcre-dev] [Bug 633] New: pcre_get_substring*: Problems with UTF8
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=633
           Summary: pcre_get_substring*: Problems with UTF8
           Product: PCRE
           Version: N/A
          Platform: Other
        OS/Version: Linux
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
        AssignedTo: ph10@???
        ReportedBy: olaf@???
                CC: pcre-dev@???



The pcre_get_substring family of functions has problems if the "haystack" is
UTF8. The ovector offsets from pcreRegexExecute are treated as single-byte
offsets, so if "haystack" contains UTF8 characters, the result is truncated.

Extracting the strings with xmlUTF8Strsub (from libxml) works as expected.
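
The behaviour described above turns on the difference between byte offsets and
character offsets in a UTF8 string. The following is a minimal sketch of that
difference only. It calls neither PCRE nor libxml, the helper names
utf8_chars_in_prefix and utf8_byte_offset are hypothetical, and it assumes the
input is valid, NUL-terminated UTF8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PCRE or libxml): count the UTF8
 * characters in the first n_bytes bytes of s. Continuation bytes have
 * the bit pattern 10xxxxxx and are skipped. Assumes valid UTF8. */
static size_t utf8_chars_in_prefix(const char *s, size_t n_bytes)
{
    size_t count = 0;
    assert(s != NULL);
    for (size_t i = 0; i < n_bytes; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Hypothetical helper: return the byte offset at which the character
 * with index char_index starts. If char_index is past the last
 * character, the byte length of s is returned. Assumes valid UTF8. */
static size_t utf8_byte_offset(const char *s, size_t char_index)
{
    size_t i = 0;
    assert(s != NULL);
    while (s[i] != '\0') {
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            /* s[i] starts a new character */
            if (char_index == 0)
                break;
            char_index--;
        }
        i++;
    }
    return i;
}
```

For the string "über" (5 bytes, 4 characters), the substring "ber" spans
byte offsets 2 to 5 but character indices 1 to 4. Passing one kind of offset
to a function that expects the other yields a shifted or shortened substring,
so it is worth checking in each API's documentation which kind it uses.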


--
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email