[Pcre-svn] [340] code/trunk: Fix incorrect error for pattern…

Author: Subversion repository
Date:
To: pcre-svn
Subject: [Pcre-svn] [340] code/trunk: Fix incorrect error for patterns like /(?2)[]a()b](abc)/
Revision: 340
          http://vcs.pcre.org/viewvc?view=rev&revision=340
Author:   ph10
Date:     2008-04-18 21:00:21 +0100 (Fri, 18 Apr 2008)


Log Message:
-----------
Fix incorrect error for patterns like /(?2)[]a()b](abc)/

Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/pcre_compile.c
    code/trunk/testdata/testinput2
    code/trunk/testdata/testoutput2


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2008-04-13 15:14:34 UTC (rev 339)
+++ code/trunk/ChangeLog    2008-04-18 20:00:21 UTC (rev 340)
@@ -62,6 +62,15 @@
     (a) A lone ] character is dis-allowed (Perl treats it as data).
     (b) A back reference to an unmatched subpattern matches an empty string 
         (Perl fails the current match path).
+        
+14. A pattern such as /(?2)[]a()b](abc)/ which had a forward reference to a 
+    non-existent subpattern following a character class starting with ']' and
+    containing () gave an internal compiling error instead of "reference to
+    non-existent subpattern". Fortunately, when the pattern did exist, the
+    compiled code was correct. (When scanning forwards to check for the 
+    existence of the subpattern, it was treating the data ']' as terminating
+    the class, so got the count wrong. When actually compiling, the reference 
+    was subsequently set up correctly.)



Version 7.6 28-Jan-08

Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c    2008-04-13 15:14:34 UTC (rev 339)
+++ code/trunk/pcre_compile.c    2008-04-18 20:00:21 UTC (rev 340)
@@ -1008,10 +1008,33 @@
     continue;
     }


- /* Skip over character classes */
+ /* Skip over character classes; this logic must be similar to the way they
+ are handled for real. If the first character is '^', skip it. Also, if the
+ first few characters (either before or after ^) are \Q\E or \E we skip them
+ too. This makes for compatibility with Perl. */

   if (*ptr == '[')
     {
+    BOOL negate_class = FALSE;
+    for (;;)
+      {
+      int c = *(++ptr);
+      if (c == '\\')
+        {
+        if (ptr[1] == 'E') ptr++;
+          else if (strncmp((const char *)ptr+1, "Q\\E", 3) == 0) ptr += 3;
+            else break;
+        }
+      else if (!negate_class && c == '^')
+        negate_class = TRUE;
+      else break;
+      }
+
+    /* If the next character is ']', it is a data character that must be
+    skipped. */
+    
+    if (ptr[1] == ']') ptr++;  
+ 
     while (*(++ptr) != ']')
       {
       if (*ptr == 0) return -1;


Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2008-04-13 15:14:34 UTC (rev 339)
+++ code/trunk/testdata/testinput2    2008-04-18 20:00:21 UTC (rev 340)
@@ -2667,4 +2667,29 @@
 /TA]/<JS>
     The ACTA] comes 


+/(?2)[]a()b](abc)/
+    abcbabc
+
+/(?2)[^]a()b](abc)/
+    abcbabc
+
+/(?1)[]a()b](abc)/
+    abcbabc
+    ** Failers 
+    abcXabc
+
+/(?1)[^]a()b](abc)/
+    abcXabc
+    ** Failers 
+    abcbabc
+
+/(?2)[]a()b](abc)(xyz)/
+    xyzbabcxyz
+
+/(?&N)[]a(?<N>)](?<M>abc)/
+   abc<abc
+
+/(?&N)[]a(?<N>)](abc)/
+   abc<abc
+
 / End of testinput2 /


Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2008-04-13 15:14:34 UTC (rev 339)
+++ code/trunk/testdata/testoutput2    2008-04-18 20:00:21 UTC (rev 340)
@@ -9545,4 +9545,40 @@
 /TA]/<JS>
 Failed: ] is an invalid data character in JavaScript compatibility mode at offset 2


+/(?2)[]a()b](abc)/
+Failed: reference to non-existent subpattern at offset 3
+
+/(?2)[^]a()b](abc)/
+Failed: reference to non-existent subpattern at offset 3
+
+/(?1)[]a()b](abc)/
+    abcbabc
+ 0: abcbabc
+ 1: abc
+    ** Failers 
+No match
+    abcXabc
+No match
+
+/(?1)[^]a()b](abc)/
+    abcXabc
+ 0: abcXabc
+ 1: abc
+    ** Failers 
+No match
+    abcbabc
+No match
+
+/(?2)[]a()b](abc)(xyz)/
+    xyzbabcxyz
+ 0: xyzbabcxyz
+ 1: abc
+ 2: xyz
+
+/(?&N)[]a(?<N>)](?<M>abc)/
+Failed: reference to non-existent subpattern at offset 4
+
+/(?&N)[]a(?<N>)](abc)/
+Failed: reference to non-existent subpattern at offset 4
+
 / End of testinput2 /
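
The forward scan described in the ChangeLog entry can be illustrated with a small standalone sketch. It is not the PCRE source: skip_class and count_groups are hypothetical names, only plain "(" groups not followed by "?" are counted, and \Q...\E runs and POSIX classes such as [:alpha:] inside a class are ignored. The sketch peeks at p[1] instead of advancing first, so the check for a leading ']' looks at the first class character itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a PCRE function). p points at '['. Returns a
   pointer to the ']' that closes the class, or NULL if it is unterminated. */
static const char *skip_class(const char *p)
{
    int negated = 0;

    assert(*p == '[');

    /* Peek past an optional '^' and any leading \E or \Q\E, leaving p on the
       last prefix character so that p[1] is the first class character. */
    for (;;) {
        if (p[1] == '\\') {
            if (p[2] == 'E') p += 2;
            else if (strncmp(p + 2, "Q\\E", 3) == 0) p += 4;
            else break;
        } else if (!negated && p[1] == '^') {
            negated = 1;
            p++;
        } else {
            break;
        }
    }

    /* A ']' in first position is a data character, not the terminator. */
    if (p[1] == ']') p++;

    while (*(++p) != ']') {
        if (*p == '\0') return NULL;
        if (*p == '\\' && p[1] != '\0') p++;   /* skip an escaped character */
    }
    return p;
}

/* Hypothetical helper: counts "(" groups that are not followed by '?',
   skipping escapes and character classes. Returns -1 if a class is
   unterminated. Named groups are deliberately not handled. */
static int count_groups(const char *pattern)
{
    int count = 0;
    const char *p;

    for (p = pattern; *p != '\0'; p++) {
        if (*p == '\\') {
            if (p[1] == '\0') break;
            p++;                               /* skip the escaped character */
        } else if (*p == '[') {
            p = skip_class(p);
            if (p == NULL) return -1;
        } else if (*p == '(' && p[1] != '?') {
            count++;
        }
    }
    return count;
}
```

Under these assumptions, count_groups("(?2)[]a()b](abc)") returns 1, so a reference to group 2 has no target, which is consistent with the "reference to non-existent subpattern" result in testoutput2. With the extra (xyz) group the count is 2 and the reference resolves.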