[pcre-dev] Partial matching of UTF8 multibyte chars at the e…

Author: ND
Date:
To: Pcre-dev
Subject: [pcre-dev] Partial matching of UTF8 multibyte chars at the end of segment
Hi, Philip!

A UTF-8 data stream may arrive in segments whose final bytes are the leading
bytes of a multibyte character but not the complete character. The remaining
bytes of that character arrive with the next segment. This happens, for
example, when UTF-8 text is transmitted over a byte-oriented protocol such as
TCP. At present PCRE returns PCRE_ERROR_BADUTF8 in this situation, but IMHO it
would be more correct to return PCRE_ERROR_PARTIAL.
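
To make the case concrete, here is a minimal sketch of how a caller could
recognize such a truncated tail by itself before handing a segment to the
matcher. It is not PCRE code: the names utf8_seq_len and utf8_incomplete_tail
are hypothetical, and the check is structural only (it looks at lead and
continuation byte patterns and does not reject overlong forms or surrogates).

```c
#include <stddef.h>

/* Total length of a UTF-8 sequence implied by its lead byte,
   or 0 if the byte cannot start a sequence. (Hypothetical helper.) */
static size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80) return 1;              /* ASCII */
    if ((b & 0xE0) == 0xC0) return 2;    /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;    /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;    /* 11110xxx */
    return 0;                            /* continuation or invalid lead */
}

/* Returns how many bytes at the end of buf form the start of a multibyte
   character whose remaining bytes have not arrived yet, or 0 if the
   segment does not end in such a prefix. (Hypothetical helper.) */
size_t utf8_incomplete_tail(const unsigned char *buf, size_t len)
{
    size_t back = 0;

    /* Step back over at most three trailing continuation bytes (10xxxxxx). */
    while (back < len && back < 3 && (buf[len - 1 - back] & 0xC0) == 0x80)
        back++;

    if (back == len)
        return 0;   /* empty, or continuation bytes only: not a partial char */

    size_t need = utf8_seq_len(buf[len - 1 - back]);
    size_t have = back + 1;

    /* Partial only if a valid lead byte asks for more bytes than we hold. */
    return (need != 0 && have < need) ? have : 0;
}
```

A caller could hold back the reported number of bytes and prepend them to the
next segment; a nonzero result marks exactly the situation where a "partial"
result would be more useful than a "bad UTF-8" error.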

Best regards.