[Pcre-svn] [628] code/trunk: Fix class bug when UCP but not …

Author: Subversion repository
To: pcre-svn
Subject: [Pcre-svn] [628] code/trunk: Fix class bug when UCP but not UTF was set and all wide characters need to be
Revision: 628
          http://www.exim.org/viewvc/pcre2?view=rev&revision=628
Author:   ph10
Date:     2016-12-26 17:11:18 +0000 (Mon, 26 Dec 2016)
Log Message:
-----------
Fix class bug when UCP but not UTF was set and all wide characters need to be 
included.
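
The condition behind this fix can be modelled in a few lines of C. This is a minimal sketch, not code from pcre2_compile.c: the function names `wide_chars_possible` and `update_wide_flag` are hypothetical, and the code unit width and UTF setting are passed as ordinary parameters, whereas the real code uses the compile-time `PCRE2_CODE_UNIT_WIDTH` macro and a local `utf` variable.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of PCRE2: reports whether a subject can
   contain characters above 0xff for a given code unit width and UTF setting. */
bool wide_chars_possible(int code_unit_width, bool utf)
{
  /* In the 8-bit library without UTF, every character is a single byte,
     so no value above 0xff can occur. */
  if (code_unit_width == 8) return utf;

  /* 16-bit and 32-bit code units can hold values above 0xff even when
     UTF is not set. */
  return true;
}

/* Models the guarded update from the diff: the "match all or no wide
   characters" flag is accumulated only when wide characters can exist. */
bool update_wide_flag(bool flag, bool local_negate,
                      int code_unit_width, bool utf)
{
  if (wide_chars_possible(code_unit_width, utf)) flag |= local_negate;
  return flag;
}
```

Under this model, a negated item such as `[:^ascii:]` leaves the flag unset only in the 8-bit non-UTF case, which is the case the revision corrects.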


Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/src/pcre2_compile.c
    code/trunk/testdata/testinput10
    code/trunk/testdata/testinput12
    code/trunk/testdata/testoutput10
    code/trunk/testdata/testoutput12-16
    code/trunk/testdata/testoutput12-32


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2016-12-24 16:25:11 UTC (rev 627)
+++ code/trunk/ChangeLog    2016-12-26 17:11:18 UTC (rev 628)
@@ -255,7 +255,11 @@
 38. Add the "-ac" command line option to pcre2test as a synonym for "-pattern
 auto_callout".


+39. In a library with Unicode support, incorrect data was compiled for a
+pattern with PCRE2_UCP set without PCRE2_UTF if a class required all wide
+characters to match (for example, /[\s[:^ascii:]]/).

+
Version 10.22 29-July-2016
--------------------------


Modified: code/trunk/src/pcre2_compile.c
===================================================================
--- code/trunk/src/pcre2_compile.c    2016-12-24 16:25:11 UTC (rev 627)
+++ code/trunk/src/pcre2_compile.c    2016-12-26 17:11:18 UTC (rev 628)
@@ -4927,9 +4927,13 @@
           automatically handled by the use of OP_CLASS or OP_NCLASS, but an
           explicit range is needed for OP_XCLASS. Setting a flag here
           causes the range to be generated later when it is known that
-          OP_XCLASS is required. */
+          OP_XCLASS is required. In the 8-bit library this is relevant only in 
+          utf mode, since no wide characters can exist otherwise. */


           default:
+#if PCRE2_CODE_UNIT_WIDTH == 8
+          if (utf)
+#endif 
           match_all_or_no_wide_chars |= local_negate;
           break;
           }
@@ -5217,6 +5221,8 @@
     all wide characters (depending on whether the whole class is or is not
     negated). This requirement is indicated by match_all_or_no_wide_chars being
     true. We do this by including an explicit range, which works in both cases.
+    This applies only in UTF and 16-bit and 32-bit non-UTF modes, since there
+    cannot be any wide characters in 8-bit non-UTF mode.


     When there *are* properties in a positive UTF-8 or any 16-bit or 32_bit
     class where \S etc is present without PCRE2_UCP, causing an extended class


Modified: code/trunk/testdata/testinput10
===================================================================
--- code/trunk/testdata/testinput10    2016-12-24 16:25:11 UTC (rev 627)
+++ code/trunk/testdata/testinput10    2016-12-26 17:11:18 UTC (rev 628)
@@ -456,4 +456,6 @@


/(*:*++++++++++++''''''''''''''''''''+''+++'+++x+++++++++++++++++++++++++++++++++++(++++++++++++++++++++:++++++%++:''''''''''''''''''''''''+++++++++++++++++++++++++++++++++++++++++++++++++++++-++++++++k+++++++''''+++'+++++++++++++++++++++++''''++++++++++++':ƿ)/utf

+/[\s[:^ascii:]]/B,ucp
+
# End of testinput10

Modified: code/trunk/testdata/testinput12
===================================================================
--- code/trunk/testdata/testinput12    2016-12-24 16:25:11 UTC (rev 627)
+++ code/trunk/testdata/testinput12    2016-12-26 17:11:18 UTC (rev 628)
@@ -358,4 +358,6 @@
 \= Expect no match
     123     


+/[\s[:^ascii:]]/B,ucp
+
# End of testinput12

Modified: code/trunk/testdata/testoutput10
===================================================================
--- code/trunk/testdata/testoutput10    2016-12-24 16:25:11 UTC (rev 627)
+++ code/trunk/testdata/testoutput10    2016-12-26 17:11:18 UTC (rev 628)
@@ -1567,4 +1567,12 @@
 /(*:*++++++++++++''''''''''''''''''''+''+++'+++x+++++++++++++++++++++++++++++++++++(++++++++++++++++++++:++++++%++:''''''''''''''''''''''''+++++++++++++++++++++++++++++++++++++++++++++++++++++-++++++++k+++++++''''+++'+++++++++++++++++++++++''''++++++++++++':ƿ)/utf
 Failed: error 176 at offset 259: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)


+/[\s[:^ascii:]]/B,ucp
+------------------------------------------------------------------
+        Bra
+        [\x80-\xff\p{Xsp}]
+        Ket
+        End
+------------------------------------------------------------------
+
 # End of testinput10


Modified: code/trunk/testdata/testoutput12-16
===================================================================
--- code/trunk/testdata/testoutput12-16    2016-12-24 16:25:11 UTC (rev 627)
+++ code/trunk/testdata/testoutput12-16    2016-12-26 17:11:18 UTC (rev 628)
@@ -1407,4 +1407,12 @@
     123     
 No match


+/[\s[:^ascii:]]/B,ucp
+------------------------------------------------------------------
+        Bra
+        [\x80-\xff\p{Xsp}\x{100}-\x{ffff}]
+        Ket
+        End
+------------------------------------------------------------------
+
 # End of testinput12


Modified: code/trunk/testdata/testoutput12-32
===================================================================
--- code/trunk/testdata/testoutput12-32    2016-12-24 16:25:11 UTC (rev 627)
+++ code/trunk/testdata/testoutput12-32    2016-12-26 17:11:18 UTC (rev 628)
@@ -1401,4 +1401,12 @@
     123     
 No match


+/[\s[:^ascii:]]/B,ucp
+------------------------------------------------------------------
+        Bra
+        [\x80-\xff\p{Xsp}\x{100}-\x{ffffffff}]
+        Ket
+        End
+------------------------------------------------------------------
+
 # End of testinput12
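
The three expected outputs above differ only in the explicit wide range appended to the class: none for 8-bit, `\x{100}-\x{ffff}` for 16-bit, and `\x{100}-\x{ffffffff}` for 32-bit. The following sketch reproduces those bounds for non-UTF mode. The helper names `max_code_unit_value` and `wide_range_non_utf` are hypothetical and do not exist in PCRE2; the sketch covers only the non-UTF case shown in these tests.

```c
#include <stdint.h>

/* Hypothetical helper: the largest value a single code unit can hold,
   which in non-UTF mode is also the largest possible character value. */
uint32_t max_code_unit_value(int code_unit_width)
{
  switch (code_unit_width) {
    case 8:  return 0xffu;
    case 16: return 0xffffu;
    default: return 0xffffffffu;   /* 32-bit code units */
  }
}

/* Returns 1 and stores the bounds of the explicit wide range in *lo and
   *hi when such a range is needed in non-UTF mode; returns 0 and leaves
   the outputs untouched when no character above 0xff can exist. */
int wide_range_non_utf(int code_unit_width, uint32_t *lo, uint32_t *hi)
{
  uint32_t max = max_code_unit_value(code_unit_width);
  if (max <= 0xffu) return 0;      /* 8-bit: nothing beyond \xff */
  *lo = 0x100u;
  *hi = max;
  return 1;
}
```

For widths 16 and 32 the stored bounds match the ranges printed in testoutput12-16 and testoutput12-32, and for width 8 no range is produced, matching testoutput10.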