Author: ND
Date:  
To: Pcre-dev
Subject: Re: [pcre-dev] First slot of the offset vector have a wrong value when PCRE_ERROR_SHORTUTF8 rises
On 2011-02-05 15:17, Philip Hazel wrote:

> I treated PCRE_ERROR_SHORTUTF8 as an error, so the offsets are not set


> I realized that I cannot make pcre_exec() do what you want.
> PCRE_ERROR_SHORTUTF8 is given instead of PCRE_ERROR_BADUTF8, and both
> are generated during the check for UTF-8 validity that occurs right at
> the start of pcre_exec(), before it does any actual matching. Therefore,
> it does not have a starting match value to put in the offsets



I think the problem could be resolved if PCRE stored the position of
the incomplete UTF-8 character as the start offset.
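
To illustrate what "the position of the incomplete UTF-8 character" means,
here is a minimal sketch in C. It is not PCRE code: find_short_utf8() is a
hypothetical helper, and it assumes the rest of the subject has already
passed the validity check, so it only inspects the last few bytes.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE. Returns the byte offset of a
   truncated UTF-8 character at the end of s[0..len), or -1 if the subject
   does not end with one. Only the tail is examined; the earlier bytes are
   assumed to be valid UTF-8 already. */
long find_short_utf8(const unsigned char *s, size_t len)
{
    size_t back;

    /* Step backwards over at most three continuation bytes (10xxxxxx)
       until a lead byte or an ASCII byte is found. */
    for (back = 1; back <= len && back <= 4; back++) {
        unsigned char c = s[len - back];
        size_t need;

        if ((c & 0xC0) == 0x80)
            continue;                 /* continuation byte: keep looking */
        if (c < 0x80)
            return -1;                /* ASCII byte: nothing is truncated */

        if ((c & 0xE0) == 0xC0)
            need = 2;                 /* 110xxxxx starts a 2-byte character */
        else if ((c & 0xF0) == 0xE0)
            need = 3;                 /* 1110xxxx starts a 3-byte character */
        else if ((c & 0xF8) == 0xF0)
            need = 4;                 /* 11110xxx starts a 4-byte character */
        else
            return -1;                /* invalid lead byte: not a "short" case */

        /* 'back' bytes of this character are present; it is truncated
           only if fewer bytes are present than the lead byte requires. */
        return (back < need) ? (long)(len - back) : -1;
    }
    return -1;                        /* empty subject or only continuation bytes */
}
```

With a helper like this, a subject ending in the first two bytes of a
three-byte character would report the offset of the lead byte, which is the
value I would expect to see in the first slot of the offset vector.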