[pcre-dev] [Bug 1100] Crash backtracking over unicode sequen…

Author: Philip Hazel
Date:  
To: pcre-dev
Old-Topics: [pcre-dev] [Bug 1100] New: Crash backtracking over unicode sequence
Subject: [pcre-dev] [Bug 1100] Crash backtracking over unicode sequence
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=1100

Philip Hazel <ph10@???> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED





--- Comment #8 from Philip Hazel <ph10@???> 2011-07-19 11:05:19 ---
I have committed a patch that fixes this. I appreciate your efforts in trying
to generate a patch, but in fact you were looking in the wrong place! The bug
was a dozen or so lines above where your patch applies. It was triggered by \X
trying to match at a point where the first character had the M property. This
should fail under the current definition of \X, which is "an extended Unicode
sequence", documented as being equivalent to (?>\PM\pM*), and the fixed code
now does fail.
However, while investigating this, I discovered that Perl has changed its
definition of \X to "extended grapheme cluster", which is a Unicode artefact in
which the initial \PM character is optional. I am wondering whether I should
change PCRE, or provide an option to do so.
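
To illustrate the difference between the two definitions, here is a rough
sketch in Python. It is not PCRE code and not the committed patch. The function
names are made up for this example, and the second function models only the
"initial \PM is optional" point mentioned above, not the full Unicode extended
grapheme cluster rules, which are more involved.

```python
import unicodedata


def is_mark(ch):
    # True if the character has the Unicode M (Mark) property: Mn, Mc or Me.
    return unicodedata.category(ch).startswith("M")


def match_legacy_x(text, pos):
    # Hypothetical helper: emulate (?>\PM\pM*) at index pos.
    # Returns the end index of the match, or None if it fails.
    if pos >= len(text) or is_mark(text[pos]):
        return None  # the first character must NOT have the M property
    end = pos + 1
    while end < len(text) and is_mark(text[end]):
        end += 1
    return end


def match_optional_base_x(text, pos):
    # Hypothetical helper: same idea, but the initial non-mark character is
    # optional, so a sequence may begin with a mark. Simplified assumption.
    if pos >= len(text):
        return None
    end = pos
    if not is_mark(text[end]):
        end += 1
    while end < len(text) and is_mark(text[end]):
        end += 1
    return end


# "e" followed by U+0301 COMBINING ACUTE ACCENT, then "x"
sample = "e\u0301x"
print(match_legacy_x(sample, 0))         # starts at the base character
print(match_legacy_x(sample, 1))         # starts at the combining mark
print(match_optional_base_x(sample, 1))  # starts at the combining mark
```

Under the first definition a match attempted at the combining mark fails,
which is the behaviour described above for the fixed code. Under the second it
succeeds and consumes just the mark.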


--
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email