[pcre-dev] [Bug 633] pcre_get_substring*: Problems with UTF8

Author: Philip Hazel
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 633] pcre_get_substring*: Problems with UTF8

http://bugs.exim.org/show_bug.cgi?id=633




--- Comment #1 from Philip Hazel <ph10@???> 2007-11-20 18:52:10 ---
On Tue, 20 Nov 2007, Olaf Walkowiak wrote:

> The pcre_get_substring family of functions has problems if the "haystack" is
> UTF8. The ovector values from pcreRegexExecute are treated as single-byte
> characters, so if "haystack" contains UTF8 characters, the result is truncated.


What is pcreRegexExecute? It is not part of the PCRE library. I also
cannot understand what you mean by "haystack".

The ovector values that are returned from pcre_exec() are byte offsets
into the subject string, not character offsets, even in UTF-8 mode, so
that is what pcre_get_substring() (for example) expects.
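
To make the byte-versus-character distinction concrete, here is a minimal
sketch in plain C. It does not call PCRE at all: the helper names
utf8_chars_in() and copy_span() are hypothetical, and the offsets passed
to copy_span() are assumed to stand in for one ovector pair as
pcre_exec() would fill it. The point is only that a span must be copied
by byte offsets, and that a character count is smaller than the byte
count whenever multi-byte UTF-8 characters are present.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the UTF-8 characters in the first nbytes
   bytes of s. Continuation bytes have the bit pattern 10xxxxxx and do
   not start a new character, so they are not counted. */
static size_t utf8_chars_in(const char *s, size_t nbytes)
{
    size_t count = 0;
    for (size_t i = 0; i < nbytes; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Hypothetical helper: copy subject[start..end) into buf, treating
   start and end as BYTE offsets (assumed to come from an ovector pair).
   Returns the number of bytes copied, or -1 if buf is too small to hold
   the span plus the terminating NUL. */
static int copy_span(const char *subject, int start, int end,
                     char *buf, size_t bufsize)
{
    assert(start >= 0 && start <= end);
    size_t len = (size_t)(end - start);
    if (len + 1 > bufsize)
        return -1;
    memcpy(buf, subject + start, len);
    buf[len] = '\0';
    return (int)len;
}
```

For the subject "naïve café" encoded as UTF-8, the word "café" occupies
byte offsets 7 to 12, which is 5 bytes but only 4 characters. Code that
sizes its copy by the character count instead of the byte count would
cut the final character in half, which matches the truncation described
in the report.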

Regards,
Philip

