[pcre-dev] [PATCH] PCRE2 on Windows

Author: Daniel Richard G.
Date:  
To: pcre-dev
Subject: [pcre-dev] [PATCH] PCRE2 on Windows
Hello list,

I have been building and testing PCRE2 on Windows, using somewhat older
versions of Visual Studio (required by my employer for customer-system
compatibility). I have found a few issues and have attached a patch
(against r929) for a couple of them.

In patch order:

* RunTest.bat: In test 2, this batch script passes an argument to -error
that does not match the one in the RunTest shell script, so test 2 was
always failing on Windows.

* pcre2_dfa_match.c: Several invocations of pcre2test were crashing in
_chkstk(), meaning that the program ran out of stack space. I tracked
the crash down to this file. To make a long story short, all those big
local_offsets[] and local_workspace[] arrays are the source of the
problem. Reducing their size from 1000 to 200 elements allows the test
to run to completion.

This is probably not the most desirable fix, and I suspect some
refactoring may be in order here. Simply making these arrays
heap-allocated would be tricky: it would likely have performance
implications, and all the conditional return statements make avoiding
memory leaks non-trivial. A more invasive solution might be needed.

* pcre2grep.c: Needs the same snprintf() workaround as seen elsewhere
in the tree.

* Also added some logic so that a non-C99 snprintf() is handled correctly
(it returns -1 in the case of overflow).
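On the heap-allocation option mentioned for pcre2_dfa_match.c above, here is a minimal sketch of one way to keep a heap-allocated workspace leak-free when a function has many conditional exits: route every exit through a single cleanup label. This is not code from PCRE2. run_submatch(), match_with_heap_workspace(), and the -2 error code are hypothetical names chosen for illustration, and the cost of one malloc() per call is not addressed.

```c
#include <stdlib.h>

#define WORKSPACE_SIZE 1000   /* same element count as the original arrays */

/* Hypothetical stand-in for the nested internal_dfa_match() call. */
static int run_submatch(int *workspace, size_t wscount, int input)
{
    workspace[0] = input;              /* touch both ends of the buffer */
    workspace[wscount - 1] = input;
    return (input < 0) ? -1 : input * 2;
}

/* Hypothetical caller: the workspace lives on the heap, not the stack. */
int match_with_heap_workspace(int input)
{
    int rc;
    int *workspace = malloc(WORKSPACE_SIZE * sizeof *workspace);

    if (workspace == NULL) return -2;  /* placeholder out-of-memory code */

    rc = run_submatch(workspace, WORKSPACE_SIZE, input);
    if (rc < 0) goto cleanup;          /* early exit still frees the buffer */

    rc += 1;                           /* stand-in for further processing */

cleanup:
    free(workspace);
    return rc;
}
```

With this shape, each former "return x;" inside the block becomes "rc = x; goto cleanup;", which is mechanical but touches every exit path.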
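On the snprintf() return-value check above, the following is a minimal sketch of a wrapper that treats both conventions as overflow: a negative return (as from the older MSVC _snprintf family) and a C99 return value that does not fit in the buffer. format_checked() is a hypothetical helper, not something in the PCRE2 tree, and the sketch assumes a vsnprintf() is available (on older MSVC it would have to be mapped to _vsnprintf). Note that it compares with >= rather than >, because a C99 return value equal to the buffer size already means the last character was dropped to make room for the terminator.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: returns 0 if the whole string fit in buf,
   -1 on a formatting error or truncation. */
int format_checked(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int ret;

    va_start(ap, fmt);
    ret = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    /* ret < 0:      error, or overflow reported by a non-C99 implementation
       ret >= size:  C99 implementation reporting that the output was cut */
    if (ret < 0 || (size_t)ret >= size) {
        if (size > 0) buf[size - 1] = '\0';   /* always leave a terminator */
        return -1;
    }
    return 0;
}
```

A call such as format_checked(buff1, sizeof(buff1), "%.*s", baselen, op->long_name) != 0 could then stand in for the two-part comparison, at the price of one extra function.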


With all that, there are a few remaining issues:

* The code currently cannot be compiled without a stdint.h header, which
is available only in relatively recent versions of Visual Studio.
However, this portable and permissively-licensed implementation of the
header worked without issue:

    http://www.azillionmonkeys.com/qed/pstdint.h


Just rename it to stdint.h and drop it into the top level of the build tree.

* Test 6 still crashes due to running out of stack space, though in this
case the issue is a very deep call stack. I had to add /STACK:3000000
to the linker invocation for this issue to go away.

* Test 6 also fails because line 4932 of testinput6 contains a Ctrl-Z
character, which on DOS/Windows indicates EOF. This results in
pcre2test exiting with an "Unexpected EOF, pcre2test run abandoned"
error. Either the Ctrl-Z character will need to be removed, or the
file-read logic in extend_inputline() modified so that Ctrl-Z is handled
like any other byte.

* The following JIT tests failed:

    8 bit: Test should not match: [644] 'a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaa' @ 'aaaaaaaaaaaaaaaaaaaaaaa'
    16 bit: Test should not match: [644] 'a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaa' @ 'aaaaaaaaaaaaaaaaaaaaaaa'
    32 bit: Test should not match: [644] 'a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?aaaaaaaaaaaaaaaaaaaaaaa' @ 'aaaaaaaaaaaaaaaaaaaaaaa'
    8 bit: Test should not match: [647] '(?:a*)*b' @ 'aaaaaaaaaaaaaaaaaaaaaaaa b'
    16 bit: Test should not match: [647] '(?:a*)*b' @ 'aaaaaaaaaaaaaaaaaaaaaaaa b'
    32 bit: Test should not match: [647] '(?:a*)*b' @ 'aaaaaaaaaaaaaaaaaaaaaaaa b'
    8 bit: Test should not match: [648] '(?:a*?)*?b' @ 'aaaaaaaaaaaaaaaaaaaaaaaa b'
    16 bit: Test should not match: [648] '(?:a*?)*?b' @ 'aaaaaaaaaaaaaaaaaaaaaaaa b'
    32 bit: Test should not match: [648] '(?:a*?)*?b' @ 'aaaaaaaaaaaaaaaaaaaaaaaa b'


I couldn't make heads or tails of the code, and so am hoping someone
here may recognize what's going on.
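On the Ctrl-Z issue in test 6 above: the Microsoft C runtime treats byte 0x1A as end-of-file only for streams opened in text mode, so one possible approach is to read the input in binary mode. The sketch below is a minimal illustration under that assumption. count_bytes_binary() is a hypothetical helper, and whether pcre2test actually opens its input in text mode has not been confirmed here.

```c
#include <stdio.h>

/* Hypothetical helper: counts every byte in the file, 0x1A included.
   Returns -1 if the file cannot be opened. */
long count_bytes_binary(const char *path)
{
    FILE *f = fopen(path, "rb");   /* "rb": no Ctrl-Z or CRLF translation */
    long n = 0;
    int c;

    if (f == NULL) return -1;
    while ((c = getc(f)) != EOF) n++;
    fclose(f);
    return n;
}
```

The trade-off is that binary mode also disables CRLF translation, so the reading code would then see the '\r' of each Windows line ending and would have to deal with it itself.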


--Daniel


P.S.: Please Cc: me on any replies, as I am not subscribed to this list.


--
Daniel Richard G. || skunk@???
My ASCII-art .sig got a bad case of Times New Roman.
Index: RunTest.bat
===================================================================
--- RunTest.bat    (revision 929)
+++ RunTest.bat    (working copy)
@@ -263,7 +263,7 @@
   set failed="yes"
   goto :eof
 ) else if [%1]==[2] (
-  %pcre2test% %mode% %4 %5 %6 %7 %8 %9 -error -63,-62,-2,-1,0,100,188,189,190,191 >>%2%bits%\%testoutput%
+  %pcre2test% %mode% %4 %5 %6 %7 %8 %9 -error -65,-62,-2,-1,0,100,101,191,200 >>%2%bits%\%testoutput%
 )
 
 set type=
Index: src/pcre2_dfa_match.c
===================================================================
--- src/pcre2_dfa_match.c    (revision 929)
+++ src/pcre2_dfa_match.c    (working copy)
@@ -2590,7 +2590,7 @@
         PCRE2_SPTR endasscode = code + GET(code, 1);
         PCRE2_SIZE local_offsets[2];
         int rc;
-        int local_workspace[1000];
+        int local_workspace[200];


         while (*endasscode == OP_ALT) endasscode += GET(endasscode, 1);


@@ -2615,8 +2615,8 @@
       case OP_COND:
       case OP_SCOND:
         {
-        PCRE2_SIZE local_offsets[1000];
-        int local_workspace[1000];
+        PCRE2_SIZE local_offsets[200];
+        int local_workspace[200];
         int codelink = (int)GET(code, 1);
         PCRE2_UCHAR condcode;


@@ -2703,8 +2703,8 @@
       case OP_RECURSE:
         {
         dfa_recursion_info *ri;
-        PCRE2_SIZE local_offsets[1000];
-        int local_workspace[1000];
+        PCRE2_SIZE local_offsets[200];
+        int local_workspace[200];
         PCRE2_SPTR callpat = start_code + GET(code, 1);
         uint32_t recno = (callpat == mb->start_code)? 0 :
           GET2(callpat, 1 + LINK_SIZE);
@@ -2799,7 +2799,7 @@
         for (matched_count = 0;; matched_count++)
           {
           PCRE2_SIZE local_offsets[2];
-          int local_workspace[1000];
+          int local_workspace[200];


           int rc = internal_dfa_match(
             mb,                                   /* fixed match data */
@@ -2870,7 +2870,7 @@
       case OP_ONCE:
         {
         PCRE2_SIZE local_offsets[2];
-        int local_workspace[1000];
+        int local_workspace[200];


         int rc = internal_dfa_match(
           mb,                                   /* fixed match data */
Index: src/pcre2grep.c
===================================================================
--- src/pcre2grep.c    (revision 929)
+++ src/pcre2grep.c    (working copy)
@@ -96,6 +96,14 @@
 #define PCRE2_CODE_UNIT_WIDTH 8
 #include "pcre2.h"


+/* Older versions of MSVC lack snprintf(). This define allows for
+warning/error-free compilation and testing with MSVC compilers back to at least
+MSVC 10/2010. Except for VC6 (which is missing some fundamentals and fails). */
+
+#if defined(_MSC_VER) && (_MSC_VER < 1900)
+#define snprintf _snprintf
+#endif
+
 #define FALSE 0
 #define TRUE 1

@@ -3663,6 +3671,7 @@
         {
         char buff1[24];
         char buff2[24];
+        int ret;


         int baselen = (int)(opbra - op->long_name);
         int fulllen = (int)(strchr(op->long_name, ')') - op->long_name + 1);
@@ -3669,10 +3678,11 @@
         int arglen = (argequals == NULL || equals == NULL)?
           (int)strlen(arg) : (int)(argequals - arg);


-        if (snprintf(buff1, sizeof(buff1), "%.*s", baselen, op->long_name) >
-              (int)sizeof(buff1) ||
-            snprintf(buff2, sizeof(buff2), "%s%.*s", buff1,
-              fulllen - baselen - 2, opbra + 1) > (int)sizeof(buff2))
+        if ((ret = snprintf(buff1, sizeof(buff1), "%.*s", baselen, op->long_name),
+             ret < 0 || ret > (int)sizeof(buff1)) ||
+            (ret = snprintf(buff2, sizeof(buff2), "%s%.*s", buff1,
+                     fulllen - baselen - 2, opbra + 1),
+             ret < 0 || ret > (int)sizeof(buff2)))
           {
           fprintf(stderr, "pcre2grep: Buffer overflow when parsing %s option\n",
             op->long_name);