Author: Philip Hazel Date: To: Manohar S Cc: pcre-dev Subject: Re: [pcre-dev] PCRE with UTF-8
On Thu, 26 Jun 2008, Manohar S wrote:
> I have attached the actual string for which ovector is not filled up
> properly.
> It seems my UTF-8 characters are not shown in the mail properly.
> Please find the attachment with proper UTF-8 text.
When I run your pattern and string through pcretest, it works fine:
PCRE version 7.7 2008-05-07
/[\'\"][\x{80}-\x{ffff}a-zA-Z0-9]+[\'\"];/8
select * from account where a = 'ਠਡਢಉಉವಷಡಢಣತಥವಷಡಢಣತಥ';
0: '\x{a20}\x{a21}\x{a22}\x{c89}\x{c89}\x{cb5}\x{cb7}\x{ca1}\x{ca2}\x{ca3}\x{ca4}\x{ca5}\x{cb5}\x{cb7}\x{ca1}\x{ca2}\x{ca3}\x{ca4}\x{ca5}';
(pcretest always shows captured strings using escapes to avoid display
problems.) So something in your code is not working properly: PCRE is
filling ovector correctly when run from pcretest.
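
To compare your own program's result with the pcretest output above, it
can help to print the matched bytes in the same escaped style. What
follows is only a minimal sketch, assuming the matched text is valid
UTF-8. escape_utf8 is a hypothetical helper written for this
illustration; it is not part of PCRE and is not how pcretest itself is
implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of PCRE): render `len` bytes of UTF-8 from
 * `src` into `out`, copying printable ASCII as-is and writing every other
 * code point as \x{hex}, in the style of the pcretest output.
 * Returns the length of the escaped string (excluding the NUL), or -1 if
 * the input is malformed or `out` is too small. */
static int escape_utf8(const unsigned char *src, size_t len,
                       char *out, size_t outsize)
{
    size_t i = 0, o = 0;

    assert(src != NULL && out != NULL);
    if (outsize == 0) return -1;

    while (i < len) {
        unsigned char b = src[i];
        unsigned long cp;
        size_t extra;   /* number of continuation bytes expected */

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return -1;              /* stray continuation or invalid lead byte */

        if (extra > len - i - 1) return -1;   /* sequence is truncated */

        for (size_t k = 1; k <= extra; k++) {
            unsigned char c = src[i + k];
            if ((c & 0xC0) != 0x80) return -1;
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        int n;
        if (cp >= 0x20 && cp < 0x7F)
            n = snprintf(out + o, outsize - o, "%c", (int)cp);
        else
            n = snprintf(out + o, outsize - o, "\\x{%lx}", cp);
        if (n < 0 || (size_t)n >= outsize - o) return -1;   /* out too small */
        o += (size_t)n;
    }
    out[o] = '\0';
    return (int)o;
}
```

After a successful pcre_exec() call, the whole match could be passed as
escape_utf8(subject + ovector[0], ovector[1] - ovector[0], buf, sizeof buf),
with subject cast to const unsigned char *. Note that the values in
ovector are byte offsets into the subject, not character counts, so for
text like yours they will be larger than the number of characters
matched.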