[pcre-dev] [Bug 1058] Feeding bad utf8 strings confuses pcre…

Author: Philip Hazel
Date:
To: pcre-dev
Subject: [pcre-dev] [Bug 1058] Feeding bad utf8 strings confuses pcre_exec()
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=1058

Philip Hazel <ph10@???> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID





--- Comment #2 from Philip Hazel <ph10@???> 2011-01-12 16:55:43 ---
Re-visiting this, I don't understand your comment. You said your target string
was "+\001\177", but you did not say what length you gave to pcre_exec() [the
subject is specified as a pointer + length]. All three bytes in your string are
less than 0x80 (\200), so none of them will be treated as the start of a
multibyte UTF-8 character. I tried to reproduce any oddity using pcretest on
the latest release, and in all cases the result is just a match on the first
"+" character.

What you submitted is NOT bad Unicode, or rather, it is not bad UTF-8; it's a
valid string of 3 single-byte characters (or 4 if you gave a length of 4).
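
To make the point concrete, here is a minimal sketch of the byte-level check
described above. The helper all_single_byte_utf8() is hypothetical and is not
part of the PCRE API; it only illustrates that a pointer + length subject in
which every byte is below 0x80 consists entirely of one-byte UTF-8 characters.
It assumes the subject was written as a C string literal, so a length of 4
takes in the terminating zero byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a PCRE function: returns 1 if each of the first
   len bytes of s is below 0x80 (a complete one-byte UTF-8 character),
   and 0 as soon as a byte of 0x80 or above is found. */
static int all_single_byte_utf8(const char *s, size_t len)
{
    size_t i;

    assert(s != NULL || len == 0);

    for (i = 0; i < len; i++) {
        /* Cast first: plain char may be signed, so bytes 0x80..0xff
           could otherwise compare as negative values. */
        if ((unsigned char)s[i] >= 0x80)
            return 0;
    }
    return 1;
}
```

Under those assumptions, all_single_byte_utf8("+\001\177", 3) and
all_single_byte_utf8("+\001\177", 4) both return 1, whereas a subject such as
"\200" with length 1 returns 0, because 0x80 cannot stand alone as a one-byte
character.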

I am going to close this report as "invalid". If there is further data that
might help me diagnose a problem, please re-open it.


--
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email