[Pcre-svn] [355] code/trunk: Make pcretest generate a single…

Author: Subversion repository
Date:  
To: pcre-svn
Subject: [Pcre-svn] [355] code/trunk: Make pcretest generate a single byte for \x{} escapes in non-UTF-8 mode.
Revision: 355
          http://vcs.pcre.org/viewvc?view=rev&revision=355
Author:   ph10
Date:     2008-07-07 18:45:23 +0100 (Mon, 07 Jul 2008)


Log Message:
-----------
Make pcretest generate a single byte for \x{} escapes in non-UTF-8 mode.

Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/pcretest.c
    code/trunk/testdata/testinput2
    code/trunk/testdata/testinput5
    code/trunk/testdata/testinput7
    code/trunk/testdata/testoutput2
    code/trunk/testdata/testoutput5
    code/trunk/testdata/testoutput7


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2008-07-07 16:30:33 UTC (rev 354)
+++ code/trunk/ChangeLog    2008-07-07 17:45:23 UTC (rev 355)
@@ -21,6 +21,11 @@
 4.  Caseless matching was not working for non-ASCII characters in back 
     references. For example, /(\x{de})\1/8i was not matching \x{de}\x{fe}.
     It now works when Unicode Property Support is available. 
+    
+5.  In pcretest, an escape such as \x{de} in the data was always generating
+    a UTF-8 string, even in non-UTF-8 mode. Now it generates a single byte in
+    non-UTF-8 mode. If the value is greater than 255, it gives a warning about
+    truncation.   



Version 7.7 07-May-08

Modified: code/trunk/pcretest.c
===================================================================
--- code/trunk/pcretest.c    2008-07-07 16:30:33 UTC (rev 354)
+++ code/trunk/pcretest.c    2008-07-07 17:45:23 UTC (rev 355)
@@ -1806,9 +1806,19 @@
             {
             unsigned char buff8[8];
             int ii, utn;
-            utn = ord2utf8(c, buff8);
-            for (ii = 0; ii < utn - 1; ii++) *q++ = buff8[ii];
-            c = buff8[ii];   /* Last byte */
+            if (use_utf8)
+              { 
+              utn = ord2utf8(c, buff8);
+              for (ii = 0; ii < utn - 1; ii++) *q++ = buff8[ii];
+              c = buff8[ii];   /* Last byte */
+              }
+            else
+             {
+             if (c > 255) 
+               fprintf(outfile, "** Character \\x{%x} is greater than 255 and "
+                 "UTF-8 mode is not enabled.\n"
+                 "** Truncation will probably give the wrong result.\n", c);
+             }      
             p = pt + 1;
             break;
             }
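
The behaviour described in ChangeLog item 5 can be sketched in isolation as follows. This is a minimal illustration, not the pcretest code: encode_utf8() and expand_hex_escape() are hypothetical names standing in for ord2utf8() and the inline logic in the hunk above, and the sketch assumes that truncation keeps the low 8 bits of the value, as a plain assignment to an unsigned char would.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for pcretest's ord2utf8(): encode code point cp
   as UTF-8 into buf and return the number of bytes written (1 to 4).
   Values above 0x1fffff are not handled in this sketch. */
int encode_utf8(unsigned int cp, unsigned char *buf)
{
  if (cp < 0x80) {
    buf[0] = (unsigned char)cp;
    return 1;
  }
  if (cp < 0x800) {
    buf[0] = (unsigned char)(0xc0 | (cp >> 6));
    buf[1] = (unsigned char)(0x80 | (cp & 0x3f));
    return 2;
  }
  if (cp < 0x10000) {
    buf[0] = (unsigned char)(0xe0 | (cp >> 12));
    buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3f));
    buf[2] = (unsigned char)(0x80 | (cp & 0x3f));
    return 3;
  }
  buf[0] = (unsigned char)(0xf0 | ((cp >> 18) & 0x07));
  buf[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3f));
  buf[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3f));
  buf[3] = (unsigned char)(0x80 | (cp & 0x3f));
  return 4;
}

/* Hypothetical wrapper: expand the value of one \x{...} data escape into
   out (at least 4 bytes) and return the number of bytes produced.
   In UTF-8 mode the value is encoded as UTF-8. Otherwise exactly one
   byte is produced, with a warning on warn when the value exceeds 255. */
int expand_hex_escape(unsigned int cp, int use_utf8,
                      unsigned char *out, FILE *warn)
{
  assert(out != NULL && warn != NULL);
  if (use_utf8)
    return encode_utf8(cp, out);
  if (cp > 255)
    fprintf(warn, "** Character \\x{%x} is greater than 255 and "
      "UTF-8 mode is not enabled.\n"
      "** Truncation will probably give the wrong result.\n", cp);
  out[0] = (unsigned char)(cp & 0xff);   /* Assumed: keep the low 8 bits */
  return 1;
}
```

Under these assumptions, expand_hex_escape(0xde, 0, buf, stderr) yields the single byte 0xde, the same value in UTF-8 mode yields the two bytes 0xc3 0x9e, and 0x123 in non-UTF-8 mode prints the warning and yields 0x23.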


Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2008-07-07 16:30:33 UTC (rev 354)
+++ code/trunk/testdata/testinput2    2008-07-07 17:45:23 UTC (rev 355)
@@ -1988,10 +1988,10 @@
     a\rb\<anycrlf>


 /^abc./mgx<any>
-    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x85abc7 \x{2028}abc8 \x{2029}abc9 JUNK
+    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x85abc7 JUNK


 /abc.$/mgx<any>
-    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc7\x{2028} abc8\x{2029} abc9
+    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc7 abc9


/a/<cr><any>

@@ -2147,7 +2147,7 @@
     abc\r\n\r\n


 /abc.$/mgx<anycrlf>
-    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc7\x{2028} abc8\x{2029} abc9
+    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc9


 /^X/m
     XABC


Modified: code/trunk/testdata/testinput5
===================================================================
--- code/trunk/testdata/testinput5    2008-07-07 16:30:33 UTC (rev 354)
+++ code/trunk/testdata/testinput5    2008-07-07 17:45:23 UTC (rev 355)
@@ -473,4 +473,8 @@
     ** Failers
     ab  


+/(\x{de})\1/
+    \x{de}\x{de}
+    \x{123} 
+
 / End of testinput5 /


Modified: code/trunk/testdata/testinput7
===================================================================
--- code/trunk/testdata/testinput7    2008-07-07 16:30:33 UTC (rev 354)
+++ code/trunk/testdata/testinput7    2008-07-07 17:45:23 UTC (rev 355)
@@ -4151,10 +4151,10 @@
     a\rb\<any>   


 /^abc./mgx<any>
-    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x85abc7 \x{2028}abc8 \x{2029}abc9 JUNK
+    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x85abc7 JUNK


 /abc.$/mgx<any>
-    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc7\x{2028} abc8\x{2029} abc9
+    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc9


 /^a\Rb/<bsr_unicode>
     a\nb


Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2008-07-07 16:30:33 UTC (rev 354)
+++ code/trunk/testdata/testoutput2    2008-07-07 17:45:23 UTC (rev 355)
@@ -7851,7 +7851,7 @@
 No match


 /^abc./mgx<any>
-    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x85abc7 \x{2028}abc8 \x{2029}abc9 JUNK
+    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x85abc7 JUNK
  0: abc1
  0: abc2
  0: abc3
@@ -7861,7 +7861,7 @@
  0: abc7


 /abc.$/mgx<any>
-    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc7\x{2028} abc8\x{2029} abc9
+    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc7 abc9
  0: abc1
  0: abc2
  0: abc3
@@ -8163,7 +8163,7 @@
  0+ 


 /abc.$/mgx<anycrlf>
-    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc7\x{2028} abc8\x{2029} abc9
+    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc9
  0: abc1
  0: abc4
  0: abc5


Modified: code/trunk/testdata/testoutput5
===================================================================
--- code/trunk/testdata/testoutput5    2008-07-07 16:30:33 UTC (rev 354)
+++ code/trunk/testdata/testoutput5    2008-07-07 17:45:23 UTC (rev 355)
@@ -1628,4 +1628,13 @@
     ab  
 No match


+/(\x{de})\1/
+    \x{de}\x{de}
+ 0: \xde\xde
+ 1: \xde
+    \x{123} 
+** Character \x{123} is greater than 255 and UTF-8 mode is not enabled.
+** Truncation will probably give the wrong result.
+No match
+
 / End of testinput5 /


Modified: code/trunk/testdata/testoutput7
===================================================================
--- code/trunk/testdata/testoutput7    2008-07-07 16:30:33 UTC (rev 354)
+++ code/trunk/testdata/testoutput7    2008-07-07 17:45:23 UTC (rev 355)
@@ -6805,7 +6805,7 @@
 No match


 /^abc./mgx<any>
-    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x85abc7 \x{2028}abc8 \x{2029}abc9 JUNK
+    abc1 \x0aabc2 \x0babc3xx \x0cabc4 \x0dabc5xx \x0d\x0aabc6 \x85abc7 JUNK
  0: abc1
  0: abc2
  0: abc3
@@ -6815,7 +6815,7 @@
  0: abc7


 /abc.$/mgx<any>
-    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc7\x{2028} abc8\x{2029} abc9
+    abc1\x0a abc2\x0b abc3\x0c abc4\x0d abc5\x0d\x0a abc6\x85 abc9
  0: abc1
  0: abc2
  0: abc3