[pcre-dev] \s*\R regression

Author: Ralf Junker
Date:  
To: pcre-dev@exim.org
Subject: [pcre-dev] \s*\R regression
Starting with PCRE 8.10, there is a \s*\R regression compared to PCRE 8.02.

Up to PCRE 8.02, the following pattern matched each of the subject strings listed below it. Starting with PCRE 8.10, it no longer matches any of them:

/\s*\R/
\x20\x0a
\x20\x0d
\x20\x0d\x0a

Running current SVN 959 pcretest also reports a memory overrun warning for each match:

Pointer arithmetic overrun in process: pcretest.exe(5896)  - pcre_exec.c#6957
  0x01E71960+1, that is at offset 50000+1 in heap block 0x01E65610 which is only 50000 bytes long.
    0x00441047 - pcre_exec.c#6957
    0x004785B6 - pcretest.c#4017
    0x0048667A
  The memory block (0x01E65610) [size: 50000 bytes] was allocated with malloc
    0x004742B5 - pcretest.c#2239
    0x0048667A


Interestingly, current SVN 959 matches all subject strings if either \s or \R is parenthesized:

/(\s)*\R/
\x20\x0a
\x20\x0d
\x20\x0d\x0a

/\s*(\R)/
\x20\x0a
\x20\x0d
\x20\x0d\x0a

I believe the regression was caused by SVN 528, which changed pcre_compile.c from

case OP_WHITESPACE:
  return next == -ESC_S || next == -ESC_d || next == -ESC_w;

to

case OP_WHITESPACE:
  return next == -ESC_S || next == -ESC_d || next == -ESC_w || next == -ESC_R;

Removing the added "|| next == -ESC_R" clears the regression and fixes the memory overrun for me. It passes my own test suite, but I did not run the PCRE tests. Unfortunately, SVN 528 does not include any tests, so there might be untested side effects involved.
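
To illustrate why the added clause could break these matches, here is a small stand-alone sketch. It rests on an assumption that is not verified above: that the changed check decides whether a repeat such as \s* may be treated as possessive (no backtracking) when followed by the given item. It also assumes \s covers space, \t, \n, \f and \r, and that \R covers CRLF, LF, VT, FF, CR and NEL in non-UTF-8 mode. The function names are hypothetical and this is not PCRE code; it only simulates /\s*\R/ with and without backtracking.

```c
#include <stddef.h>

/* Assumed \s set for this sketch: space, \t, \n, \f, \r. */
static int is_space(unsigned char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
}

/* Length of a \R match at s[pos], or 0 if there is none.
   Assumed non-UTF-8 set: CRLF, LF, VT, FF, CR, NEL (0x85). */
static size_t newline_len(const char *s, size_t len, size_t pos) {
    unsigned char c;
    if (pos >= len) return 0;
    c = (unsigned char)s[pos];
    if (c == '\r') return (pos + 1 < len && s[pos + 1] == '\n') ? 2 : 1;
    if (c == '\n' || c == 0x0b || c == '\f' || c == 0x85) return 1;
    return 0;
}

/* Unanchored search for \s*\R in s[0..len).
   possessive != 0: \s* keeps everything it consumed (no backtracking).
   possessive == 0: \s* gives characters back one at a time.
   Returns 1 on a match, 0 otherwise. */
int search_ws_newline(const char *s, size_t len, int possessive) {
    size_t start, n;
    for (start = 0; start <= len; start++) {
        n = start;
        /* Greedy part: consume as much whitespace as possible. */
        while (n < len && is_space((unsigned char)s[n])) n++;
        for (;;) {
            if (newline_len(s, len, n) > 0) return 1;
            if (possessive || n == start) break;  /* nothing to give back */
            n--;                                  /* backtrack one character */
        }
    }
    return 0;
}
```

For the three subjects above, search_ws_newline(subject, length, 0) returns 1 and search_ws_newline(subject, length, 1) returns 0. The reason is that \s and \R overlap on \n, \f and \r, so a non-backtracking \s* swallows the line ending and leaves nothing for \R. If the assumption about the check is right, this would be consistent with the matches failing in 8.10 and succeeding again once the clause is removed.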

Ralf