[Pcre-svn] [964] code/trunk: Check for overlong name in (*MA…

Top Page
Delete this message
Author: Subversion repository
Date:  
To: pcre-svn
Subject: [Pcre-svn] [964] code/trunk: Check for overlong name in (*MARK) etc.
Revision: 964
          http://vcs.pcre.org/viewvc?view=rev&revision=964
Author:   ph10
Date:     2012-05-04 14:03:39 +0100 (Fri, 04 May 2012)


Log Message:
-----------
Check for overlong name in (*MARK) etc.

Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/doc/pcreapi.3
    code/trunk/doc/pcrelimits.3
    code/trunk/doc/pcrepattern.3
    code/trunk/pcre_compile.c
    code/trunk/pcre_internal.h
    code/trunk/pcreposix.c
    code/trunk/testdata/testinput14
    code/trunk/testdata/testinput17
    code/trunk/testdata/testoutput14
    code/trunk/testdata/testoutput17


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2012-04-21 18:06:31 UTC (rev 963)
+++ code/trunk/ChangeLog    2012-05-04 13:03:39 UTC (rev 964)
@@ -107,6 +107,9 @@


 28. To catch bugs like 27 using valgrind, when pcretest is asked to specify an
     ovector size, it uses memory at the end of the block that it has got.
+    
+29. Check for an overlong MARK name and give an error at compile time. The 
+    limit is 255 for the 8-bit library and 65535 for the 16-bit library.



Version 8.30 04-February-2012

Modified: code/trunk/doc/pcreapi.3
===================================================================
--- code/trunk/doc/pcreapi.3    2012-04-21 18:06:31 UTC (rev 963)
+++ code/trunk/doc/pcreapi.3    2012-05-04 13:03:39 UTC (rev 964)
@@ -1,4 +1,4 @@
-.TH PCREAPI 3 "19 April 2012" "PCRE 8.31"
+.TH PCREAPI 3 "04 May 2012" "PCRE 8.31"
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .sp
@@ -926,6 +926,7 @@
   72  too many forward references
   73  disallowed Unicode code point (>= 0xd800 && <= 0xdfff)
   74  invalid UTF-16 string (specifically UTF-16)
+  75  name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN) 
 .sp
 The numbers 32 and 10000 in errors 48 and 49 are defaults; different values may
 be used if the limits were changed when PCRE was built.
@@ -2665,6 +2666,6 @@
 .rs
 .sp
 .nf
-Last updated: 19 April 2012
+Last updated: 04 May 2012
 Copyright (c) 1997-2012 University of Cambridge.
 .fi


Modified: code/trunk/doc/pcrelimits.3
===================================================================
--- code/trunk/doc/pcrelimits.3    2012-04-21 18:06:31 UTC (rev 963)
+++ code/trunk/doc/pcrelimits.3    2012-05-04 13:03:39 UTC (rev 964)
@@ -1,4 +1,4 @@
-.TH PCRELIMITS 3 "13 January 2012" "PCRE 8.30"
+.TH PCRELIMITS 3 "04 May 2012" "PCRE 8.30"
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .SH "SIZE AND OTHER LIMITATIONS"
@@ -32,6 +32,9 @@
 The maximum length of name for a named subpattern is 32 characters, and the
 maximum number of named subpatterns is 10000.
 .P
+The maximum length of a name in a (*MARK), (*PRUNE), (*SKIP), or (*THEN) verb
+is 255 for the 8-bit library and 65535 for the 16-bit library.
+.P
 The maximum length of a subject string is the largest positive number that an
 integer variable can hold. However, when using the traditional matching
 function, PCRE uses recursion to handle subpatterns and indefinite repetition.
@@ -58,6 +61,6 @@
 .rs
 .sp
 .nf
-Last updated: 08 January 2012
+Last updated: 04 May 2012
 Copyright (c) 1997-2012 University of Cambridge.
 .fi


Modified: code/trunk/doc/pcrepattern.3
===================================================================
--- code/trunk/doc/pcrepattern.3    2012-04-21 18:06:31 UTC (rev 963)
+++ code/trunk/doc/pcrepattern.3    2012-05-04 13:03:39 UTC (rev 964)
@@ -1,4 +1,4 @@
-.TH PCREPATTERN 3 "14 April 2012" "PCRE 8.31"
+.TH PCREPATTERN 3 "04 May 2012" "PCRE 8.31"
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .SH "PCRE REGULAR EXPRESSION DETAILS"
@@ -2605,10 +2605,11 @@
 parenthesis followed by an asterisk. They are generally of the form
 (*VERB) or (*VERB:NAME). Some may take either form, with differing behaviour,
 depending on whether or not an argument is present. A name is any sequence of
-characters that does not include a closing parenthesis. If the name is empty,
-that is, if the closing parenthesis immediately follows the colon, the effect
-is as if the colon were not there. Any number of these verbs may occur in a
-pattern.
+characters that does not include a closing parenthesis. The maximum length of
+name is 255 in the 8-bit library and 65535 in the 16-bit library. If the name
+is empty, that is, if the closing parenthesis immediately follows the colon,
+the effect is as if the colon were not there. Any number of these verbs may
+occur in a pattern.
 .
 .
 .\" HTML <a name="nooptimize"></a>
@@ -2910,6 +2911,6 @@
 .rs
 .sp
 .nf
-Last updated: 14 April 2012
+Last updated: 04 May 2012
 Copyright (c) 1997-2012 University of Cambridge.
 .fi


Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c    2012-04-21 18:06:31 UTC (rev 963)
+++ code/trunk/pcre_compile.c    2012-05-04 13:03:39 UTC (rev 964)
@@ -489,6 +489,8 @@
   "too many forward references\0"
   "disallowed Unicode code point (>= 0xd800 && <= 0xdfff)\0"
   "invalid UTF-16 string\0"
+  /* 75 */
+  "name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN)\0"  
   ;


 /* Table to identify digits and hex digits. This is used when compiling
@@ -5591,7 +5593,7 @@
       ptr++;
       while (MAX_255(*ptr) && (cd->ctypes[*ptr] & ctype_letter) != 0) ptr++;
       namelen = (int)(ptr - name);
-
+      
       /* It appears that Perl allows any characters whatsoever, other than
       a closing parenthesis, to appear in arguments, so we no longer insist on
       letters, digits, and underscores. */
@@ -5601,6 +5603,11 @@
         arg = ++ptr;
         while (*ptr != 0 && *ptr != CHAR_RIGHT_PARENTHESIS) ptr++;
         arglen = (int)(ptr - arg);
+        if (arglen > (int)MAX_MARK)
+          {
+          *errorcodeptr = ERR75;
+          goto FAILED;
+          }     
         }


       if (*ptr != CHAR_RIGHT_PARENTHESIS)


Modified: code/trunk/pcre_internal.h
===================================================================
--- code/trunk/pcre_internal.h    2012-04-21 18:06:31 UTC (rev 963)
+++ code/trunk/pcre_internal.h    2012-05-04 13:03:39 UTC (rev 964)
@@ -523,6 +523,11 @@


#define PUT2INC(a,n,d) PUT2(a,n,d), a += IMM2_SIZE

+/* The maximum length of a MARK name is currently one data unit; it may be
+changed in future to be a fixed number of bytes or to depend on LINK_SIZE. */
+
+#define MAX_MARK ((1 << (sizeof(pcre_uchar)*8)) - 1)
+
 /* When UTF encoding is being used, a character is no longer just a single
 character. The macros for character handling generate simple sequences when
 used in character-mode, and more complicated ones for UTF characters.
@@ -1940,7 +1945,7 @@
        ERR40, ERR41, ERR42, ERR43, ERR44, ERR45, ERR46, ERR47, ERR48, ERR49,
        ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59,
        ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69,
-       ERR70, ERR71, ERR72, ERR73, ERR74, ERRCOUNT };
+       ERR70, ERR71, ERR72, ERR73, ERR74, ERR75, ERRCOUNT };


/* JIT compiling modes. The function list is indexed by them. */
enum { JIT_COMPILE, JIT_PARTIAL_SOFT_COMPILE, JIT_PARTIAL_HARD_COMPILE,

Modified: code/trunk/pcreposix.c
===================================================================
--- code/trunk/pcreposix.c    2012-04-21 18:06:31 UTC (rev 963)
+++ code/trunk/pcreposix.c    2012-05-04 13:03:39 UTC (rev 964)
@@ -158,7 +158,9 @@
   REG_BADPAT,  /* \N is not supported in a class */
   REG_BADPAT,  /* too many forward references */
   REG_BADPAT,  /* disallowed UTF-8/16 code point (>= 0xd800 && <= 0xdfff) */
-  REG_BADPAT   /* invalid UTF-16 string (should not occur) */
+  REG_BADPAT,  /* invalid UTF-16 string (should not occur) */
+  /* 75 */
+  REG_BADPAT   /* overlong MARK name */  
 };


/* Table of texts corresponding to POSIX error codes */

Modified: code/trunk/testdata/testinput14
===================================================================
--- code/trunk/testdata/testinput14    2012-04-21 18:06:31 UTC (rev 963)
+++ code/trunk/testdata/testinput14    2012-05-04 13:03:39 UTC (rev 964)
@@ -314,4 +314,10 @@


/\777/I

+/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/K
+    XX
+     
+/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/K
+    XX
+
 /-- End of testinput14 --/


Modified: code/trunk/testdata/testinput17
===================================================================
--- code/trunk/testdata/testinput17    2012-04-21 18:06:31 UTC (rev 963)
+++ code/trunk/testdata/testinput17    2012-05-04 13:03:39 UTC (rev 964)
@@ -280,4 +280,10 @@


/[^\x{100}]*[^\x{1000}]+[^\x{ffff}]??[^\x{8000}]{4,}[^\x{7fff}]{2,9}?[^\x{100}]{5,6}+/BZi

+/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/K
+    XX
+     
+/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/K
+    XX
+
 /-- End of testinput17 --/


Modified: code/trunk/testdata/testoutput14
===================================================================
--- code/trunk/testdata/testoutput14    2012-04-21 18:06:31 UTC (rev 963)
+++ code/trunk/testdata/testoutput14    2012-05-04 13:03:39 UTC (rev 964)
@@ -453,4 +453,12 @@
 /\777/I
 Failed: octal value is greater than \377 in 8-bit non-UTF-8 mode at offset 3


+/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/K
+Failed: name is too long in (*MARK), (*PRUNE), (*SKIP), or (*THEN) at offset 259
+
+/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/K
+    XX
+ 0: XX
+MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE
+
 /-- End of testinput14 --/


Modified: code/trunk/testdata/testoutput17
===================================================================
--- code/trunk/testdata/testoutput17    2012-04-21 18:06:31 UTC (rev 963)
+++ code/trunk/testdata/testoutput17    2012-05-04 13:03:39 UTC (rev 964)
@@ -506,4 +506,14 @@
         End
 ------------------------------------------------------------------


+/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF)XX/K
+    XX
+ 0: XX
+MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF
+     
+/(*:0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE)XX/K
+    XX
+ 0: XX
+MK: 0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDE
+
 /-- End of testinput17 --/