Re: [pcre-dev] A clarification upon PCRE_NO_UTF(8|16)_CHECK

Author: Giuseppe D'Angelo
Date:  
To: Zoltán Herczeg
CC: pcre-dev
Subject: Re: [pcre-dev] A clarification upon PCRE_NO_UTF(8|16)_CHECK
2012/4/7 Zoltán Herczeg <hzmester@???>:
>
> So your guess is right: if the subject string does not change, PCRE_NO_UTF8_CHECK can be passed to subsequent calls (regardless of whether the pattern changes, whether it matches, or anything else).


Ok, great!
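
For illustration, the rule above could be tracked with a small helper
like the one below. This is only a sketch under assumptions: the class
name SubjectOptions and its methods are hypothetical, and kNoUtfCheck is
a placeholder standing in for the PCRE_NO_UTF16_CHECK macro from
<pcre.h>, which real code would use instead. The caller is assumed to
report when an exec call has actually performed the validity check
without finding a UTF error.

```cpp
#include <string>
#include <utility>

// Placeholder for PCRE_NO_UTF16_CHECK (assumption: real code would use
// the macro from <pcre.h> rather than this illustrative constant).
constexpr int kNoUtfCheck = 0x2000;

// Hypothetical helper: remembers whether the current subject string has
// already been validated by a previous match attempt.
class SubjectOptions {
public:
    explicit SubjectOptions(std::u16string subject)
        : m_subject(std::move(subject)), m_checked(false) {}

    const std::u16string &subject() const { return m_subject; }

    // Options to pass to the next match call on this subject.
    int nextOptions() const { return m_checked ? kNoUtfCheck : 0; }

    // Call after a match attempt that ran the UTF check and reported no
    // UTF error (a plain "no match" result counts as well).
    void recordCheckedCall() { m_checked = true; }

    // Any change to the subject invalidates the earlier check.
    void setSubject(std::u16string subject)
    {
        m_subject = std::move(subject);
        m_checked = false;
    }

private:
    std::u16string m_subject;
    bool m_checked;
};
```

The flag is tied to the subject only, so switching to a different
pattern does not reset it; only setSubject() does.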

> Moreover, if QString guarantees that it always contains a valid UTF-16 string, you can pass PCRE_NO_UTF8_CHECK all the time. Just check that the starting offset also points to the beginning of a valid UTF-16 character. This is easy: check that the memory location does not point to the second half of a surrogate pair.


Unfortunately that's not the case: from this point of view, QString is
a "mere" container of UTF-16 code units. One has access to the
underlying data and can easily build an ill-formed UTF-16 sequence. :-(
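
To make the two checks discussed here concrete, the following is a
minimal sketch, not code from Qt or PCRE. The function names are
hypothetical. It verifies surrogate pairing only; PCRE's own validity
check may apply additional rules depending on the version, so this is
not a drop-in replacement for it.

```cpp
#include <cstddef>

// High (leading) surrogates are U+D800..U+DBFF.
inline bool isHighSurrogate(char16_t u) { return (u & 0xFC00) == 0xD800; }

// Low (trailing) surrogates are U+DC00..U+DFFF.
inline bool isLowSurrogate(char16_t u) { return (u & 0xFC00) == 0xDC00; }

// Returns true if every surrogate in data[0..length) is part of a
// well-formed high/low pair.
bool isValidUtf16(const char16_t *data, std::size_t length)
{
    for (std::size_t i = 0; i < length; ++i) {
        if (isHighSurrogate(data[i])) {
            // A high surrogate must be followed by a low surrogate.
            if (i + 1 == length || !isLowSurrogate(data[i + 1]))
                return false;
            ++i; // skip the low surrogate just verified
        } else if (isLowSurrogate(data[i])) {
            return false; // isolated low surrogate
        }
    }
    return true;
}

// For a string already known to be valid: an offset is a character
// boundary unless it points at the second half of a surrogate pair.
bool isCharacterBoundary(const char16_t *data, std::size_t length,
                         std::size_t offset)
{
    if (offset > length)
        return false; // out of range
    return offset == length || !isLowSurrogate(data[offset]);
}
```

Since QString cannot promise well-formed contents, a caller would need
something like isValidUtf16() (or one checked PCRE call) before
skipping the check; isCharacterBoundary() alone is only enough when the
string is already known to be valid.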

> Btw, I heard that the Qt5 alpha has been released and contains a PCRE-based QRegularExpression. Really nice job!


Thank you very much! Another one for the list on the homepage ;-)

Cheers,
--
Giuseppe D'Angelo