[Pcre-svn] [1725] code/trunk: Fix \C backtracking in UTF-8 i…

Author: Subversion repository
Date:  
To: pcre-svn
Subject: [Pcre-svn] [1725] code/trunk: Fix \C backtracking in UTF-8 issue for repeated character classes, which were
Revision: 1725
          http://vcs.pcre.org/viewvc?view=rev&revision=1725
Author:   ph10
Date:     2018-02-20 15:45:01 +0000 (Tue, 20 Feb 2018)
Log Message:
-----------
Fix \C backtracking in UTF-8 issue for repeated character classes, which were 
overlooked when it was fixed for other repeats. 
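
Why the comparison changes from == to <= can be illustrated with a small
simulation. This is only a sketch under stated assumptions, not code from
pcre_exec.c: back_char() and backtrack_attempts() are hypothetical stand-ins
for the BACKCHAR macro and the backtracking loop, plain array indices replace
the eptr and pp pointers, and the scenario assumes that \C has consumed only
the lead byte 0xCE of a two-byte character, so pp sits on the continuation
byte 0x8F. Stepping back to the start of a character then jumps below pp, and
an equality test never sees the two meet.

```c
#include <assert.h>

/* Hypothetical stand-in for PCRE's BACKCHAR macro: move an index back to the
   first byte of a UTF-8 character by skipping continuation bytes (10xxxxxx). */
int back_char(const unsigned char *s, int i)
{
  while (i > 0 && (s[i] & 0xc0) == 0x80) i--;
  return i;
}

/* Simplified model of the backtracking loop touched by this change.
   pp is the index where the repeat started, eptr the index it reached.
   Returns the number of retry attempts made, or -1 if the index ran off
   the start of the subject (the misbehaviour being fixed). */
int backtrack_attempts(const unsigned char *s, int pp, int eptr, int fixed)
{
  int attempts = 0;
  for (;;)
    {
    int stop;
    attempts++;                       /* stands in for the RMATCH() retry */
    stop = fixed ? (eptr <= pp) : (eptr == pp);
    eptr--;
    if (stop) break;                  /* Stop if tried at original pos */
    if (eptr < 0) return -1;          /* walked past the start of the subject */
    eptr = back_char(s, eptr);
    }
  return attempts;
}

/* Subject bytes for "[A" U+038F "B" U+0140 "C]" in UTF-8, as in the new tests. */
void run_sketch(void)
{
  static const unsigned char subject[] =
    { '[', 'A', 0xCE, 0x8F, 'B', 0xC5, 0x80, 'C', ']' };

  /* Assumed scenario: pp = 3 (the 0x8F byte), repeat ran to the end (9). */
  assert(backtrack_attempts(subject, 3, 9, 0) == -1);  /* old test: overruns */
  assert(backtrack_attempts(subject, 3, 9, 1) == 6);   /* new test: stops    */
}
```

With the equality test the index steps from 4 to 3, is moved back to 2 by
back_char(), and from then on can never equal pp again; with <= the loop
stops as soon as it is at or before the starting position.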


Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/pcre_exec.c
    code/trunk/testdata/testinput5
    code/trunk/testdata/testoutput5


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2018-02-19 16:35:05 UTC (rev 1724)
+++ code/trunk/ChangeLog    2018-02-20 15:45:01 UTC (rev 1725)
@@ -44,7 +44,12 @@
 group that did capture something were not being correctly returned as "unset" 
 (that is, with offset values of -1).


+10. Matching the pattern /(*UTF)\C[^\v]+\x80/ against an 8-bit string
+containing multi-code-unit characters caused bad behaviour and possibly a
+crash. This issue was fixed for other kinds of repeat in release 8.37 by change
+38, but repeating character classes were overlooked.

+
Version 8.41 05-July-2017
-------------------------


Modified: code/trunk/pcre_exec.c
===================================================================
--- code/trunk/pcre_exec.c    2018-02-19 16:35:05 UTC (rev 1724)
+++ code/trunk/pcre_exec.c    2018-02-20 15:45:01 UTC (rev 1725)
@@ -3053,7 +3053,7 @@
             {
             RMATCH(eptr, ecode, offset_top, md, eptrb, RM18);
             if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-            if (eptr-- == pp) break;        /* Stop if tried at original pos */
+            if (eptr-- <= pp) break;        /* Stop if tried at original pos */
             BACKCHAR(eptr);
             }
           }
@@ -3210,7 +3210,7 @@
           {
           RMATCH(eptr, ecode, offset_top, md, eptrb, RM21);
           if (rrc != MATCH_NOMATCH) RRETURN(rrc);
-          if (eptr-- == pp) break;        /* Stop if tried at original pos */
+          if (eptr-- <= pp) break;        /* Stop if tried at original pos */
 #ifdef SUPPORT_UTF
           if (utf) BACKCHAR(eptr);
 #endif


Modified: code/trunk/testdata/testinput5
===================================================================
--- code/trunk/testdata/testinput5    2018-02-19 16:35:05 UTC (rev 1724)
+++ code/trunk/testdata/testinput5    2018-02-20 15:45:01 UTC (rev 1725)
@@ -800,4 +800,10 @@
 /(?<=\K\x{17f})/8G+
     \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}


+/\C[^\v]+\x80/8
+    [AΏBŀC]
+
+/\C[^\d]+\x80/8
+    [AΏBŀC]
+
 /-- End of testinput5 --/


Modified: code/trunk/testdata/testoutput5
===================================================================
--- code/trunk/testdata/testoutput5    2018-02-19 16:35:05 UTC (rev 1724)
+++ code/trunk/testdata/testoutput5    2018-02-20 15:45:01 UTC (rev 1725)
@@ -1944,4 +1944,12 @@
  0: \x{17f}
  0+ 


+/\C[^\v]+\x80/8
+    [AΏBŀC]
+No match
+
+/\C[^\d]+\x80/8
+    [AΏBŀC]
+No match
+
 /-- End of testinput5 --/