http://bugs.exim.org/show_bug.cgi?id=633
--- Comment #3 from Philip Hazel <ph10@???> 2007-11-21 16:05:22 ---
> haystack = "some umlauts äöüß and more";
>
>
> *compiled_regexp = pcre_compile(regexp, /* the pattern */
What was the pattern you were using? I tried matching that string using the
pattern /(.*)/ in pcretest, and it worked fine.
> When extracting to match from "haystack" with pcre_get_substring the result is
> truncated and misses one char for each umlaut.
pcre_get_substring returned 30, which is the length in bytes (each of the four
non-ASCII characters takes two bytes in UTF-8), and all 30 bytes were returned.
The test I used was
    pcretest zz zzz
where zz contained these lines:
    /(.*)/8
    some umlauts \x{e4}\x{f6}\x{fc}\x{df} and more\G1
The output in the zzz file was as expected. It doesn't show well on my screen
because of the top-bit characters. This has exposed a bug in pcretest; it
should be showing those characters as escapes. I will fix that.
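To illustrate the difference between the byte count and the character count,
here is a minimal sketch in plain C. It is not part of PCRE and it assumes the
subject is valid UTF-8. The names utf8_char_count, haystack and
check_report_lengths are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count UTF-8 characters by counting every byte that is not a
   continuation byte (continuation bytes have the form 10xxxxxx).
   Assumes the input is valid UTF-8; no validation is done. */
size_t utf8_char_count(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* The haystack from the report, with the four non-ASCII characters
   written as explicit UTF-8 byte escapes so that the encoding of the
   source file does not matter. */
const char haystack[] =
    "some umlauts \xc3\xa4\xc3\xb6\xc3\xbc\xc3\x9f and more";

/* Hypothetical self-check: 30 bytes, but only 26 characters. */
void check_report_lengths(void)
{
    assert(strlen(haystack) == 30);
    assert(utf8_char_count(haystack) == 26);
}
```

A length reported in bytes will therefore be 4 larger than the number of
characters for this subject; code that treats the one as the other can appear
to lose one character per umlaut.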
I am afraid I am not convinced that this is a bug in PCRE.