[Pcre-svn] [1190] code/trunk: pcretest was not diagnosing c…

Author: Subversion repository
To: pcre-svn
Subject: [Pcre-svn] [1190] code/trunk: pcretest was not diagnosing characters > 0x7fffffff in 8-bit mode.
Revision: 1190
          http://vcs.pcre.org/viewvc?view=rev&revision=1190
Author:   ph10
Date:     2012-10-30 16:49:19 +0000 (Tue, 30 Oct 2012)


Log Message:
-----------
pcretest was not diagnosing characters > 0x7fffffff in 8-bit mode.

Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/pcretest.c
    code/trunk/testdata/testinput15
    code/trunk/testdata/testoutput15


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2012-10-30 16:34:17 UTC (rev 1189)
+++ code/trunk/ChangeLog    2012-10-30 16:49:19 UTC (rev 1190)
@@ -137,6 +137,15 @@
     provide fast pattern matching, so several sanity checks are not performed.
     However, feature tests are still performed. The new interface provides
     1.4x speedup compared to the old one.
+    
+29. If pcre_exec() or pcre_dfa_exec() was called with a negative value for
+    the subject string length, the error given was PCRE_ERROR_BADOFFSET, which 
+    was confusing. There is now a new error PCRE_ERROR_BADLENGTH for this case.
+    
+30. In 8-bit UTF-8 mode, pcretest failed to give an error for data codepoints
+    greater than 0x7fffffff (which cannot be represented in UTF-8, even under
+    the "old" RFC 2279). Instead, it ended up passing a negative length to 
+    pcre_exec().



Version 8.31 06-July-2012

Modified: code/trunk/pcretest.c
===================================================================
--- code/trunk/pcretest.c    2012-10-30 16:34:17 UTC (rev 1189)
+++ code/trunk/pcretest.c    2012-10-30 16:49:19 UTC (rev 1190)
@@ -4730,6 +4730,12 @@
 #ifndef NOUTF
         if (use_utf)
           {
+          if (c > 0x7fffffff)
+            {
+            fprintf(outfile, "** Character \\x{%x} is greater than 0x7fffffff "
+              "and so cannot be converted to UTF-8\n", c);
+            goto NEXT_DATA;     
+            }  
           q8 += ord2utf8(c, q8);
           }
         else


Modified: code/trunk/testdata/testinput15
===================================================================
--- code/trunk/testdata/testinput15    2012-10-30 16:34:17 UTC (rev 1189)
+++ code/trunk/testdata/testinput15    2012-10-30 16:49:19 UTC (rev 1190)
@@ -423,4 +423,8 @@
 /\x{a0}+\s!/8BZT1
     \x{a0}\x20!


+/A/8
+ \x{ff000041}
+ \x{7f000041}
+
/-- End of testinput15 --/

Modified: code/trunk/testdata/testoutput15
===================================================================
--- code/trunk/testdata/testoutput15    2012-10-30 16:34:17 UTC (rev 1189)
+++ code/trunk/testdata/testoutput15    2012-10-30 16:49:19 UTC (rev 1190)
@@ -1260,4 +1260,10 @@
     \x{a0}\x20!
  0: \x{a0} !


+/A/8
+ \x{ff000041}
+** Character \x{ff000041} is greater than 0x7fffffff and so cannot be converted to UTF-8
+ \x{7f000041}
+Error -10 (bad UTF-8 string) offset=0 reason=12
+
/-- End of testinput15 --/
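
For readers following the fix, the range check added in pcretest.c can be illustrated with a small stand-alone sketch. This is not the pcretest source: the function name encode_utf8_rfc2279 and its tables are hypothetical, written only to show why a value above 0x7fffffff has to be rejected before encoding. The "old" RFC 2279 scheme tops out at 6 bytes, which carry at most 31 bits of payload.

```c
#include <stdint.h>

/* Hypothetical helper (an assumption for illustration, not PCRE code).
   Encodes c using the RFC 2279 scheme, which allows up to 6 bytes.
   Returns the number of bytes written to out, or -1 if c is greater
   than 0x7fffffff and therefore has no encoding at all. */
static int encode_utf8_rfc2279(uint32_t c, unsigned char *out)
{
  /* Largest code point that fits in 1, 2, ... 6 bytes. */
  static const uint32_t max_for_len[] =
    { 0x7f, 0x7ff, 0xffff, 0x1fffff, 0x3ffffff, 0x7fffffff };
  /* Marker bits for the first byte of each length. */
  static const unsigned char lead[] =
    { 0x00, 0xc0, 0xe0, 0xf0, 0xf8, 0xfc };
  int extra, j;

  /* The check corresponding to the one this revision adds. */
  if (c > 0x7fffffffu) return -1;

  /* Find how many continuation bytes are needed (0..5). */
  for (extra = 0; extra < 5; extra++)
    if (c <= max_for_len[extra]) break;

  /* Fill continuation bytes from the end, 6 bits at a time. */
  for (j = extra; j > 0; j--)
    {
    out[j] = (unsigned char)(0x80 | (c & 0x3f));
    c >>= 6;
    }
  out[0] = (unsigned char)(lead[extra] | c);
  return extra + 1;
}
```

With a 6-byte buffer, the two data lines from the new test behave differently under this sketch: 0xff000041 is refused outright (return value -1), while 0x7f000041 encodes to a 6-byte sequence. That sequence is valid under RFC 2279 only, which is consistent with the test output showing the matcher's UTF-8 check rejecting it instead.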