Author: Issaana
Date:
To: pcre-dev
Subject: [pcre-dev] BACKREFERENCE with (PCRE_UTF8|PCRE_CASELESS) gives an
unexpected result
Hello,
I built PCRE with SUPPORT_UTF8 and SUPPORT_UCP, and tried the following code.
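pcre *re; const char *err; int erroff, rc, ov[6]; // declarations assumed, not in my original snippet
re = pcre_compile("(\xc3\x80)\\1", PCRE_UTF8|PCRE_CASELESS, &err, &erroff, NULL);
rc = pcre_exec(re, NULL, "\xc3\x80\xc3\x80", 4, 0, 0, ov, 6); // (A) rc=2
rc = pcre_exec(re, NULL, "\xc3\x80\xc3\xa0", 4, 0, 0, ov, 6); // (B) rc=PCRE_ERROR_NOMATCH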
\xc3\x80 is the UTF-8 encoding of U+00C0 (LATIN CAPITAL LETTER A WITH GRAVE)
\xc3\xa0 is the UTF-8 encoding of U+00E0 (LATIN SMALL LETTER A WITH GRAVE)
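To double-check the two encodings, here is a minimal sketch of how a 2-byte UTF-8 sequence decodes to a code point. The helper decode2 is a hypothetical name used only for illustration; it is not part of the PCRE API, and it handles only the 2-byte form 110xxxxx 10xxxxxx.

```c
#include <assert.h>

/* Hypothetical helper (not part of PCRE): decode one 2-byte UTF-8
   sequence of the form 110xxxxx 10xxxxxx into its code point. */
static unsigned int decode2(const unsigned char *s)
{
    assert((s[0] & 0xE0) == 0xC0);  /* lead byte of a 2-byte sequence */
    assert((s[1] & 0xC0) == 0x80);  /* continuation byte */
    /* 5 payload bits from the lead byte, 6 from the continuation byte */
    return ((unsigned int)(s[0] & 0x1F) << 6) | (unsigned int)(s[1] & 0x3F);
}
```

Calling decode2 on "\xc3\x80" yields 0xC0 and on "\xc3\xa0" yields 0xE0, so the two subjects differ only in the case of the second character.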
(B) is an unexpected result: with PCRE_CASELESS I expected the back reference to match the lowercase character too. Is this a bug, or am I misunderstanding something?
Thanks,
Issaana