[pcre-dev] [Bug 1037] Incorrect using of GETCHARLENGTH(...) …

Author: Philip Hazel
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1037] Incorrect using of GETCHARLENGTH(...) in pcre_compile.c
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=1037

Philip Hazel <ph10@???> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID





--- Comment #2 from Philip Hazel <ph10@???> 2010-11-19 16:54:41 ---
I am sorry, but making this change does not work. The value of ptr must be
increased by the number of additional bytes in the UTF-8 character so that it
ends up pointing to the final byte. I tested your change on a character class
containing the four bytes 0x41 0xc4 0xa3 0x42, which encode the three
characters 'A', \x{123}, and 'B'. Using pcretest's -b option to show the
compiled output, unmodified PCRE gives this, which is correct:

------------------------------------------------------------------
  0  45 Bra
  3     ^
  4     [AB\x{123}]+
 45  45 Ket
 48     End
------------------------------------------------------------------


After applying your patch, the result is this, which is wrong:

------------------------------------------------------------------
  0  45 Bra
  3     ^
  4     [AB\xa3\x{123}]+
 45  45 Ket
 48     End
------------------------------------------------------------------


I am not surprised; I expected it to go wrong.

The problem you had was perhaps caused by the fact that two of the arguments
to GETCHARLEN are ptr.
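
To illustrate the point about ptr appearing twice, here is a minimal sketch in
C. It is not PCRE's source: decode_utf8, SKETCH_GETCHARLEN and scan_class are
hypothetical names, and the sketch assumes that the third macro argument is
incremented by the number of additional bytes, so passing ptr there is what
moves ptr to the final byte of the character. It also assumes valid UTF-8 of
at most four bytes and passes a stray continuation byte through unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not PCRE code): decode the character starting at p,
   store it in *c, and return the number of ADDITIONAL bytes it occupies.
   Bytes below 0xc0, including stray continuation bytes, are returned as-is. */
static int decode_utf8(const unsigned char *p, unsigned int *c)
{
  static const unsigned char lead_mask[] = { 0xff, 0x1f, 0x0f, 0x07 };
  int extra = 0;
  if (*p >= 0xf0) extra = 3;
  else if (*p >= 0xe0) extra = 2;
  else if (*p >= 0xc0) extra = 1;
  unsigned int value = *p & lead_mask[extra];
  for (int i = 1; i <= extra; i++)
    value = (value << 6) | (p[i] & 0x3f);
  *c = value;
  return extra;
}

/* Hypothetical stand-in for a GETCHARLEN-style macro: sets c and ADDS the
   number of additional bytes to len. If len is the pointer itself, the
   pointer ends up on the final byte of the character. */
#define SKETCH_GETCHARLEN(c, p, len) \
  do { int extra_ = decode_utf8((p), &(c)); (len) += extra_; } while (0)

/* Collect the characters of a class body into out and return their count.
   With advance != 0 the pointer is moved over the additional bytes; with
   advance == 0 it is not, so the loop later revisits a continuation byte. */
static int scan_class(const unsigned char *s, int len, int advance,
                      unsigned int *out)
{
  assert(s != NULL && out != NULL);
  int n = 0;
  for (const unsigned char *ptr = s; ptr < s + len; ptr++)
    {
    unsigned int c = *ptr;
    if (c > 127)
      {
      if (advance)
        {
        SKETCH_GETCHARLEN(c, ptr, ptr);     /* ptr += additional bytes */
        }
      else
        {
        int ignored = 0;
        SKETCH_GETCHARLEN(c, ptr, ignored); /* ptr is left where it was */
        (void)ignored;
        }
      }
    out[n++] = c;
    }
  return n;
}
```

Under these assumptions, scanning the bytes 0x41 0xc4 0xa3 0x42 with advance
set yields the three characters 0x41, 0x123 and 0x42. With advance clear it
yields four, because 0xa3 is picked up again as a character of its own, which
resembles the stray \xa3 in the wrong output above.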

I am going to close this issue as INVALID, but I will look at your later patch
separately.


--
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email