[Pcre-svn] [1454] code/trunk: Implement pcre_stack_guard.

Author: Subversion repository
To: pcre-svn
Revision: 1454
          http://vcs.pcre.org/viewvc?view=rev&revision=1454
Author:   ph10
Date:     2014-02-09 18:55:03 +0000 (Sun, 09 Feb 2014)


Log Message:
-----------
Implement pcre_stack_guard.

Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/doc/pcreapi.3
    code/trunk/doc/pcretest.1
    code/trunk/pcre.h.in
    code/trunk/pcre_compile.c
    code/trunk/pcre_globals.c
    code/trunk/pcre_internal.h
    code/trunk/pcreposix.c
    code/trunk/pcretest.c
    code/trunk/testdata/testinput2
    code/trunk/testdata/testoutput2


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2014-01-30 06:10:21 UTC (rev 1453)
+++ code/trunk/ChangeLog    2014-02-09 18:55:03 UTC (rev 1454)
@@ -99,6 +99,12 @@
 20. The fast forward newline mechanism could enter to an infinite loop on
     certain invalid UTF-8 input. Although we don't support these cases
     this issue can be fixed by a performance optimization.
+    
+21. Change 33 of 8.34 is not sufficient to ensure stack safety because it does
+    not take account if existing stack usage. There is now a new global 
+    variable called pcre_stack_guard that can be set to point to an external 
+    function to check stack availability. It is called at the start of 
+    processing every parenthesized group.  



Version 8.34 15-December-2013

Modified: code/trunk/doc/pcreapi.3
===================================================================
--- code/trunk/doc/pcreapi.3    2014-01-30 06:10:21 UTC (rev 1453)
+++ code/trunk/doc/pcreapi.3    2014-02-09 18:55:03 UTC (rev 1454)
@@ -1,4 +1,4 @@
-.TH PCREAPI 3 "03 January 2014" "PCRE 8.35"
+.TH PCREAPI 3 "09 February 2014" "PCRE 8.35"
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .sp
@@ -116,6 +116,8 @@
 .B void (*pcre_stack_free)(void *);
 .sp
 .B int (*pcre_callout)(pcre_callout_block *);
+.sp
+.B int (*pcre_stack_guard)(void);
 .fi
 .
 .
@@ -286,6 +288,14 @@
 \fBpcrecallout\fP
 .\"
 documentation.
+.P
+The global variable \fBpcre_stack_guard\fP initially contains NULL. It can be 
+set by the caller to a function that is called by PCRE whenever it starts 
+to compile a parenthesized part of a pattern. When parentheses are nested, PCRE 
+uses recursive function calls, which use up the system stack. This function is 
+provided so that applications with restricted stacks can force a compilation 
+error if the stack runs out. The function should return zero if all is well, or 
+non-zero to force an error.
 .
 .
 .\" HTML <a name="newlines"></a>
@@ -337,7 +347,8 @@
 The PCRE functions can be used in multi-threading applications, with the
 proviso that the memory management functions pointed to by \fBpcre_malloc\fP,
 \fBpcre_free\fP, \fBpcre_stack_malloc\fP, and \fBpcre_stack_free\fP, and the
-callout function pointed to by \fBpcre_callout\fP, are shared by all threads.
+callout and stack-checking functions pointed to by \fBpcre_callout\fP and 
+\fBpcre_stack_guard\fP, are shared by all threads.
 .P
 The compiled form of a regular expression is not altered during matching, so
 the same compiled pattern can safely be used by several threads at once.
@@ -465,7 +476,10 @@
 The output is a long integer that gives the maximum depth of nesting of
 parentheses (of any kind) in a pattern. This limit is imposed to cap the amount
 of system stack used when a pattern is compiled. It is specified when PCRE is
-built; the default is 250.
+built; the default is 250. This limit does not take into account the stack that 
+may already be used by the calling application. For finer control over 
+compilation stack usage, you can set a pointer to an external checking function
+in \fBpcre_stack_guard\fP.
 .sp
   PCRE_CONFIG_MATCH_LIMIT
 .sp
@@ -991,6 +1005,8 @@
   81  missing opening brace after \eo
   82  parentheses are too deeply nested
   83  invalid range in character class
+  84  group name must start with a non-digit
+  85  parentheses are too deeply nested (stack check)  
 .sp
 The numbers 32 and 10000 in errors 48 and 49 are defaults; different values may
 be used if the limits were changed when PCRE was built.
@@ -2898,6 +2914,6 @@
 .rs
 .sp
 .nf
-Last updated: 03 January 2014
+Last updated: 09 February 2014
 Copyright (c) 1997-2014 University of Cambridge.
 .fi


Modified: code/trunk/doc/pcretest.1
===================================================================
--- code/trunk/doc/pcretest.1    2014-01-30 06:10:21 UTC (rev 1453)
+++ code/trunk/doc/pcretest.1    2014-02-09 18:55:03 UTC (rev 1454)
@@ -1,4 +1,4 @@
-.TH PCRETEST 1 "17 January 2014" "PCRE 8.35"
+.TH PCRETEST 1 "09 February 2014" "PCRE 8.35"
 .SH NAME
 pcretest - a program for testing Perl-compatible regular expressions.
 .SH SYNOPSIS
@@ -333,6 +333,7 @@
   \fB/N\fP              set PCRE_NO_AUTO_CAPTURE
   \fB/O\fP              set PCRE_NO_AUTO_POSSESS
   \fB/P\fP              use the POSIX wrapper
+  \fB/Q\fP              test external stack check function 
   \fB/S\fP              study the pattern after compilation
   \fB/s\fP              set PCRE_DOTALL
   \fB/T\fP              select character tables
@@ -519,6 +520,15 @@
 successfully studied with the PCRE_STUDY_JIT_COMPILE option, the size of the
 JIT compiled code is also output.
 .P
+The \fB/Q\fP modifier is used to test the use of \fBpcre_stack_guard\fP. It 
+must be followed by '0' or '1', specifying the return code to be given from an 
+external function that is passed to PCRE and used for stack checking during 
+compilation (see the
+.\" HREF
+\fBpcreapi\fP
+.\"
+documentation for details). 
+.P
 The \fB/S\fP modifier causes \fBpcre[16|32]_study()\fP to be called after the
 expression has been compiled, and the results used when the expression is
 matched. There are a number of qualifying characters that may follow \fB/S\fP.
@@ -1141,6 +1151,6 @@
 .rs
 .sp
 .nf
-Last updated: 17 January 2014
+Last updated: 09 February 2014
 Copyright (c) 1997-2014 University of Cambridge.
 .fi


Modified: code/trunk/pcre.h.in
===================================================================
--- code/trunk/pcre.h.in    2014-01-30 06:10:21 UTC (rev 1453)
+++ code/trunk/pcre.h.in    2014-02-09 18:55:03 UTC (rev 1454)
@@ -5,7 +5,7 @@
 /* This is the public header file for the PCRE library, to be #included by
 applications that call the PCRE functions.


-           Copyright (c) 1997-2013 University of Cambridge
+           Copyright (c) 1997-2014 University of Cambridge


-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -491,36 +491,42 @@
PCRE_EXP_DECL void *(*pcre_stack_malloc)(size_t);
PCRE_EXP_DECL void (*pcre_stack_free)(void *);
PCRE_EXP_DECL int (*pcre_callout)(pcre_callout_block *);
+PCRE_EXP_DECL int (*pcre_stack_guard)(void);

PCRE_EXP_DECL void *(*pcre16_malloc)(size_t);
PCRE_EXP_DECL void (*pcre16_free)(void *);
PCRE_EXP_DECL void *(*pcre16_stack_malloc)(size_t);
PCRE_EXP_DECL void (*pcre16_stack_free)(void *);
PCRE_EXP_DECL int (*pcre16_callout)(pcre16_callout_block *);
+PCRE_EXP_DECL int (*pcre16_stack_guard)(void);

PCRE_EXP_DECL void *(*pcre32_malloc)(size_t);
PCRE_EXP_DECL void (*pcre32_free)(void *);
PCRE_EXP_DECL void *(*pcre32_stack_malloc)(size_t);
PCRE_EXP_DECL void (*pcre32_stack_free)(void *);
PCRE_EXP_DECL int (*pcre32_callout)(pcre32_callout_block *);
+PCRE_EXP_DECL int (*pcre32_stack_guard)(void);
#else /* VPCOMPAT */
PCRE_EXP_DECL void *pcre_malloc(size_t);
PCRE_EXP_DECL void pcre_free(void *);
PCRE_EXP_DECL void *pcre_stack_malloc(size_t);
PCRE_EXP_DECL void pcre_stack_free(void *);
PCRE_EXP_DECL int pcre_callout(pcre_callout_block *);
+PCRE_EXP_DECL int pcre_stack_guard(void);

PCRE_EXP_DECL void *pcre16_malloc(size_t);
PCRE_EXP_DECL void pcre16_free(void *);
PCRE_EXP_DECL void *pcre16_stack_malloc(size_t);
PCRE_EXP_DECL void pcre16_stack_free(void *);
PCRE_EXP_DECL int pcre16_callout(pcre16_callout_block *);
+PCRE_EXP_DECL int pcre16_stack_guard(void);

PCRE_EXP_DECL void *pcre32_malloc(size_t);
PCRE_EXP_DECL void pcre32_free(void *);
PCRE_EXP_DECL void *pcre32_stack_malloc(size_t);
PCRE_EXP_DECL void pcre32_stack_free(void *);
PCRE_EXP_DECL int pcre32_callout(pcre32_callout_block *);
+PCRE_EXP_DECL int pcre32_stack_guard(void);
#endif /* VPCOMPAT */

/* User defined callback which provides a stack just before the match starts. */

Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c    2014-01-30 06:10:21 UTC (rev 1453)
+++ code/trunk/pcre_compile.c    2014-02-09 18:55:03 UTC (rev 1454)
@@ -547,6 +547,8 @@
   "parentheses are too deeply nested\0"
   "invalid range in character class\0"
   "group name must start with a non-digit\0"
+  /* 85 */
+  "parentheses are too deeply nested (stack check)\0"  
   ;


/* Table to identify digits and hex digits. This is used when compiling
@@ -8033,6 +8035,16 @@
 unsigned int max_bracount;
 branch_chain bc;

+/* If set, call the external function that checks for stack availability. */
+
+if (PUBL(stack_guard) != NULL && PUBL(stack_guard)())
+  {
+  *errorcodeptr= ERR85;
+  return FALSE;
+  }
+
+/* Miscellaneous initialization */
+
 bc.outer = bcptr;
 bc.current_branch = code;


Modified: code/trunk/pcre_globals.c
===================================================================
--- code/trunk/pcre_globals.c    2014-01-30 06:10:21 UTC (rev 1453)
+++ code/trunk/pcre_globals.c    2014-02-09 18:55:03 UTC (rev 1454)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.


                        Written by Philip Hazel
-           Copyright (c) 1997-2012 University of Cambridge
+           Copyright (c) 1997-2014 University of Cambridge


-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -72,6 +72,7 @@
PCRE_EXP_DATA_DEFN void *(*PUBL(stack_malloc))(size_t) = LocalPcreMalloc;
PCRE_EXP_DATA_DEFN void (*PUBL(stack_free))(void *) = LocalPcreFree;
PCRE_EXP_DATA_DEFN int (*PUBL(callout))(PUBL(callout_block) *) = NULL;
+PCRE_EXP_DATA_DEFN int (*PUBL(stack_guard))(void) = NULL;

#elif !defined VPCOMPAT
PCRE_EXP_DATA_DEFN void *(*PUBL(malloc))(size_t) = malloc;
@@ -79,6 +80,7 @@
PCRE_EXP_DATA_DEFN void *(*PUBL(stack_malloc))(size_t) = malloc;
PCRE_EXP_DATA_DEFN void (*PUBL(stack_free))(void *) = free;
PCRE_EXP_DATA_DEFN int (*PUBL(callout))(PUBL(callout_block) *) = NULL;
+PCRE_EXP_DATA_DEFN int (*PUBL(stack_guard))(void) = NULL;
#endif

/* End of pcre_globals.c */

Modified: code/trunk/pcre_internal.h
===================================================================
--- code/trunk/pcre_internal.h    2014-01-30 06:10:21 UTC (rev 1453)
+++ code/trunk/pcre_internal.h    2014-02-09 18:55:03 UTC (rev 1454)
@@ -2281,7 +2281,7 @@
        ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59,
        ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69,
        ERR70, ERR71, ERR72, ERR73, ERR74, ERR75, ERR76, ERR77, ERR78, ERR79,
-       ERR80, ERR81, ERR82, ERR83, ERR84, ERRCOUNT };
+       ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERRCOUNT };


/* JIT compiling modes. The function list is indexed by them. */


Modified: code/trunk/pcreposix.c
===================================================================
--- code/trunk/pcreposix.c    2014-01-30 06:10:21 UTC (rev 1453)
+++ code/trunk/pcreposix.c    2014-02-09 18:55:03 UTC (rev 1454)
@@ -6,7 +6,7 @@
 and semantics are as close as possible to those of the Perl 5 language.


                        Written by Philip Hazel
-           Copyright (c) 1997-2012 University of Cambridge
+           Copyright (c) 1997-2014 University of Cambridge


-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
@@ -170,7 +170,9 @@
REG_BADPAT, /* missing opening brace after \o */
REG_BADPAT, /* parentheses too deeply nested */
REG_BADPAT, /* invalid range in character class */
- REG_BADPAT /* group name must start with a non-digit */
+ REG_BADPAT, /* group name must start with a non-digit */
+ /* 85 */
+ REG_BADPAT /* parentheses too deeply nested (stack check) */
};

/* Table of texts corresponding to POSIX error codes */

Modified: code/trunk/pcretest.c
===================================================================
--- code/trunk/pcretest.c    2014-01-30 06:10:21 UTC (rev 1453)
+++ code/trunk/pcretest.c    2014-02-09 18:55:03 UTC (rev 1454)
@@ -233,6 +233,9 @@
 #define SET_PCRE_CALLOUT8(callout) \
   pcre_callout = callout


+#define SET_PCRE_STACK_GUARD8(stack_guard) \
+  pcre_stack_guard = stack_guard
+
 #define PCRE_ASSIGN_JIT_STACK8(extra, callback, userdata) \
    pcre_assign_jit_stack(extra, callback, userdata)


@@ -317,6 +320,9 @@
#define SET_PCRE_CALLOUT16(callout) \
pcre16_callout = (int (*)(pcre16_callout_block *))callout

+#define SET_PCRE_STACK_GUARD16(stack_guard) \
+  pcre16_stack_guard = (int (*)(void))stack_guard
+
 #define PCRE_ASSIGN_JIT_STACK16(extra, callback, userdata) \
   pcre16_assign_jit_stack((pcre16_extra *)extra, \
     (pcre16_jit_callback)callback, userdata)
@@ -406,6 +412,9 @@
 #define SET_PCRE_CALLOUT32(callout) \
   pcre32_callout = (int (*)(pcre32_callout_block *))callout


+#define SET_PCRE_STACK_GUARD32(stack_guard) \
+  pcre32_stack_guard = (int (*)(void))stack_guard
+
 #define PCRE_ASSIGN_JIT_STACK32(extra, callback, userdata) \
   pcre32_assign_jit_stack((pcre32_extra *)extra, \
     (pcre32_jit_callback)callback, userdata)
@@ -533,6 +542,14 @@
   else \
     SET_PCRE_CALLOUT8(callout)


+#define SET_PCRE_STACK_GUARD(stack_guard) \
+  if (pcre_mode == PCRE32_MODE) \
+    SET_PCRE_STACK_GUARD32(stack_guard); \
+  else if (pcre_mode == PCRE16_MODE) \
+    SET_PCRE_STACK_GUARD16(stack_guard); \
+  else \
+    SET_PCRE_STACK_GUARD8(stack_guard)
+
 #define STRLEN(p) (pcre_mode == PCRE32_MODE ? STRLEN32(p) : pcre_mode == PCRE16_MODE ? STRLEN16(p) : STRLEN8(p))


 #define PCRE_ASSIGN_JIT_STACK(extra, callback, userdata) \
@@ -756,6 +773,12 @@
   else \
     G(SET_PCRE_CALLOUT,BITTWO)(callout)


+#define SET_PCRE_STACK_GUARD(stack_guard) \
+  if (pcre_mode == G(G(PCRE,BITONE),_MODE)) \
+    G(SET_PCRE_STACK_GUARD,BITONE)(stack_guard); \
+  else \
+    G(SET_PCRE_STACK_GUARD,BITTWO)(stack_guard)
+
 #define STRLEN(p) ((pcre_mode == G(G(PCRE,BITONE),_MODE)) ? \
   G(STRLEN,BITONE)(p) : G(STRLEN,BITTWO)(p))


@@ -897,6 +920,7 @@
 #define PCHARSV                   PCHARSV8
 #define READ_CAPTURE_NAME         READ_CAPTURE_NAME8
 #define SET_PCRE_CALLOUT          SET_PCRE_CALLOUT8
+#define SET_PCRE_STACK_GUARD      SET_PCRE_STACK_GUARD8
 #define STRLEN                    STRLEN8
 #define PCRE_ASSIGN_JIT_STACK     PCRE_ASSIGN_JIT_STACK8
 #define PCRE_COMPILE              PCRE_COMPILE8
@@ -927,6 +951,7 @@
 #define PCHARSV                   PCHARSV16
 #define READ_CAPTURE_NAME         READ_CAPTURE_NAME16
 #define SET_PCRE_CALLOUT          SET_PCRE_CALLOUT16
+#define SET_PCRE_STACK_GUARD      SET_PCRE_STACK_GUARD16
 #define STRLEN                    STRLEN16
 #define PCRE_ASSIGN_JIT_STACK     PCRE_ASSIGN_JIT_STACK16
 #define PCRE_COMPILE              PCRE_COMPILE16
@@ -957,6 +982,7 @@
 #define PCHARSV                   PCHARSV32
 #define READ_CAPTURE_NAME         READ_CAPTURE_NAME32
 #define SET_PCRE_CALLOUT          SET_PCRE_CALLOUT32
+#define SET_PCRE_STACK_GUARD      SET_PCRE_STACK_GUARD32
 #define STRLEN                    STRLEN32
 #define PCRE_ASSIGN_JIT_STACK     PCRE_ASSIGN_JIT_STACK32
 #define PCRE_COMPILE              PCRE_COMPILE32
@@ -1015,6 +1041,7 @@
 static int jit_was_used;
 static int locale_set = 0;
 static int show_malloc;
+static int stack_guard_return;
 static int use_utf;
 static const unsigned char *last_callout_mark = NULL;


@@ -2201,6 +2228,18 @@


 /*************************************************
+*            Stack guard function                *
+*************************************************/
+
+/* Called from PCRE when set in pcre_stack_guard. We give an error (non-zero)
+return when a count overflows. */
+
+static int stack_guard(void)
+{
+return stack_guard_return;
+}
+
+/*************************************************
 *              Callout function                  *
 *************************************************/


@@ -3445,6 +3484,7 @@

   use_utf = 0;
   debug_lengths = 1;
+  SET_PCRE_STACK_GUARD(NULL);

   if (extend_inputline(infile, buffer, "  re> ") == NULL) break;
   if (infile != stdin) fprintf(outfile, "%s", (char *)buffer);
@@ -3745,6 +3785,21 @@
       case 'P': do_posix = 1; break;
 #endif


+      case 'Q':
+      switch (*pp)
+        {
+        case '0': 
+        case '1':
+        stack_guard_return = *pp++ - '0';
+        break;  
+
+        default:
+        fprintf(outfile, "** Missing 0 or 1 after /Q\n");
+        goto SKIP_DATA;
+        }
+      SET_PCRE_STACK_GUARD(stack_guard);
+      break;
+
       case 'S':
       do_study = 1;
       for (;;)
@@ -5198,7 +5253,7 @@
           if (count * 2 > use_size_offsets) count = use_size_offsets/2;
           }


-        /* Output the captured substrings. Note that, for the matched string, 
+        /* Output the captured substrings. Note that, for the matched string,
         the use of \K in an assertion can make the start later than the end. */


         for (i = 0; i < count * 2; i += 2)
@@ -5217,23 +5272,23 @@
             {
             int start = use_offsets[i];
             int end = use_offsets[i+1];
-               
+
             if (start > end)
               {
               start = use_offsets[i+1];
               end = use_offsets[i];
-              fprintf(outfile, "Start of matched string is beyond its end - " 
-                "displaying from end to start.\n"); 
-              }  
- 
+              fprintf(outfile, "Start of matched string is beyond its end - "
+                "displaying from end to start.\n");
+              }
+
             fprintf(outfile, "%2d: ", i/2);
             PCHARSV(bptr, start, end - start, outfile);
             if (verify_jit && jit_was_used) fprintf(outfile, " (JIT)");
             fprintf(outfile, "\n");
-            
+
             /* Note: don't use the start/end variables here because we want to
             show the text from what is reported as the end. */
-             
+
             if (do_showcaprest || (i == 0 && do_showrest))
               {
               fprintf(outfile, "%2d+ ", i/2);


Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2014-01-30 06:10:21 UTC (rev 1453)
+++ code/trunk/testdata/testinput2    2014-02-09 18:55:03 UTC (rev 1454)
@@ -4050,5 +4050,13 @@


 /abcd/f<lf>
     xx\nxabcd
+    
+/ -- Test stack check external calls --/ 


+/(((((a)))))/Q0
+
+/(((((a)))))/Q1
+
+/(((((a)))))/Q
+
/-- End of testinput2 --/

Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2014-01-30 06:10:21 UTC (rev 1453)
+++ code/trunk/testdata/testoutput2    2014-02-09 18:55:03 UTC (rev 1454)
@@ -14134,5 +14134,15 @@
 /abcd/f<lf>
     xx\nxabcd
 No match
+    
+/ -- Test stack check external calls --/ 


+/(((((a)))))/Q0
+
+/(((((a)))))/Q1
+Failed: parentheses are too deeply nested (stack check) at offset 0
+
+/(((((a)))))/Q
+** Missing 0 or 1 after /Q
+
/-- End of testinput2 --/
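
Illustration (not part of revision 1454): the sketch below shows, under stated assumptions, how an application-side guard might plug into the hook this change adds. It does not link against PCRE. demo_stack_guard stands in for the real pcre_stack_guard global, and demo_compile only mimics the check added to pcre_compile.c, which runs at the start of every parenthesized group, including the outermost level. All demo_* names, the 85 return value used as a stand-in for ERR85, and the address-difference heuristic are hypothetical; measuring stack use by comparing addresses of locals is implementation-dependent and is shown only as one possible approach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the global that pcre.h declares as
   int (*pcre_stack_guard)(void); renamed so this compiles without PCRE. */
static int (*demo_stack_guard)(void) = NULL;

enum { DEMO_OK = 0, DEMO_ERR85 = 85 };

/* Hypothetical application state for an address-based guard. */
static uintptr_t demo_stack_base = 0;
static size_t demo_stack_budget = 0;

/* Record roughly where the stack is now and how many more bytes are allowed. */
static void demo_guard_init(size_t budget)
{
char marker;
demo_stack_base = (uintptr_t)&marker;
demo_stack_budget = budget;
}

/* Guard: zero if all is well, non-zero to force an error. The distance between
the recorded base and a local in this frame approximates the stack in use. */
static int demo_address_guard(void)
{
char marker;
uintptr_t here = (uintptr_t)&marker;
uintptr_t used = (here > demo_stack_base) ?
  here - demo_stack_base : demo_stack_base - here;
return used > demo_stack_budget;
}

/* Guard that only counts how often it is called and never objects. */
static int demo_guard_calls = 0;
static int demo_counting_guard(void)
{
demo_guard_calls++;
return 0;
}

/* Guard that always objects, like the pcretest function under /Q1. */
static int demo_failing_guard(void)
{
return 1;
}

/* Mimics the recursive group compiler: the guard is consulted on entry, then
each '(' recurses one level deeper. No real pattern validation is done. */
static int demo_compile_group(const char **pp)
{
if (demo_stack_guard != NULL && demo_stack_guard()) return DEMO_ERR85;
while (**pp != '\0' && **pp != ')')
  {
  if (**pp == '(')
    {
    int rc;
    (*pp)++;
    rc = demo_compile_group(pp);
    if (rc != DEMO_OK) return rc;
    if (**pp == ')') (*pp)++;
    }
  else (*pp)++;
  }
return DEMO_OK;
}

static int demo_compile(const char *pattern)
{
assert(pattern != NULL);
return demo_compile_group(&pattern);
}
```

With real PCRE, the equivalent step would be to assign the application's function to pcre_stack_guard before calling pcre_compile(); a non-zero return from the guard then produces error 85. As the pcreapi text above notes, the pointer is a global shared by all threads, so a guard that keeps per-thread state would need its own arrangements.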