Author: Giuseppe D'Angelo
Date:
To: Frank Chang
CC: pcre-dev
Subject: Re: [pcre-dev] Is it possible to use UTF-8 literal characters in a C/C++ PCRE regex?

On 28 June 2012 19:00, Frank Chang <frankchang91@???> wrote:
> Good afternoon, We are trying to match the German string. Munich
> tausendschöne Jungfräulein ausendschçne, using a C/C++ PCRE regex with
> PCRE_UTF8, PCRE_UCP, PCRE_CASELESS options activated which uses the UTF-8
> literals, ö, ä, ç Is it possible to construct a valid PCRE regex which uses
> the UTF-8 literals ö or ä or ç without using codepoints?

It *is* possible, but you must ensure that the execution charset of
your compiler is set so that string literals are emitted as UTF-8
byte sequences. Is that the case? Try getting a hex dump of the string
literal you're passing to pcre_compile (or, if necessary, look at the
assembler output).
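
One way to do that check is sketched below. It is only a rough sketch: hex_dump and contains_utf8_o_umlaut are hypothetical helper names of mine, not part of PCRE. The idea is to dump the bytes of the literal exactly as the compiler emitted them. In UTF-8, ö must show up as the two bytes c3 b6; a single byte f6 would suggest the literal was emitted as Latin-1 instead.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the bytes of `s` into `out` as
   space-separated lowercase hex, e.g. "c3 b6".
   Returns 0 on success, -1 if `out` is too small. */
int hex_dump(const char *s, char *out, size_t out_size)
{
    size_t len = strlen(s);
    size_t pos = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < len; i++) {
        /* Two hex digits per byte, plus a separator before every byte
           except the first; keep one byte free for the terminator. */
        size_t need = (i == 0) ? 2 : 3;
        if (pos + need + 1 > out_size)
            return -1;
        pos += (size_t)snprintf(out + pos, out_size - pos,
                                i == 0 ? "%02x" : " %02x",
                                (unsigned char)s[i]);
    }
    return 0;
}

/* Hypothetical helper: returns 1 if `s` contains the UTF-8 encoding
   of U+00F6 (ö), i.e. the byte pair C3 B6, and 0 otherwise. */
int contains_utf8_o_umlaut(const char *s)
{
    return strstr(s, "\xC3\xB6") != NULL;
}
```

Call hex_dump on the very same literal you hand to pcre_compile and print the buffer. If the bytes are not what UTF-8 requires, writing the pattern with explicit escapes such as "\xC3\xB6" avoids depending on the compiler's charset settings.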