Author: Ralf Junker
Date:
To: pcre-dev
Subject: Re: [pcre-dev] 8.34-RC1: \s is locale dependent whereas [\s] is not
On 25.11.2013 15:41, ph10@??? wrote:
> If you did specify the locale at compile time, the tables should be
> remembered and used at runtime. In this case, there is something that
> needs to be investigated.
I use the same locale at compile time and run time. More precisely, I am
using a modified pcre_chartables.c file.
The problem surfaced when I tested PCRE 8.34-RC1 without updating
pcre_chartables.c. The old file does not define VT as white space, and
with it PCRE gives different results for [\s] and \s, which I believe
it should not.
On further investigation, it turned out that updating pcre_chartables.c
solved the issue.
It seems that for [\s], white space is determined by cbits in
pcre_compile.c line 5030.
For \s, white space is determined by md->ctypes in pcre_exec.c line
4815.
I do not fully understand how these variables are filled, but they
contain different values for VT.
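
To illustrate how two lookup tables can disagree about a single character, here is a minimal, self-contained sketch. It is not PCRE code: space_bits and type_flags are hypothetical stand-ins for a cbits-style bitmap and a ctypes-style flag table, FLAG_SPACE is an assumed flag value, and the vt_in_flags switch only simulates an old versus an updated character table.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FLAG_SPACE 0x01                /* assumed flag value, for illustration only */

static unsigned char space_bits[32];   /* hypothetical bitmap: one bit per byte value */
static unsigned char type_flags[256];  /* hypothetical table: one flag byte per byte value */

/* Lookup as a character class such as [\s] might do it: test a bit in the bitmap. */
static int class_has_space(unsigned char c)
{
    return (space_bits[c / 8] >> (c % 8)) & 1;
}

/* Lookup as an escape such as \s might do it: test a flag in the per-character table. */
static int escape_has_space(unsigned char c)
{
    return (type_flags[c] & FLAG_SPACE) != 0;
}

/* Fill both tables. VT always goes into the bitmap; it goes into the flag
   table only when vt_in_flags is nonzero (simulating an updated table). */
static void build_tables(int vt_in_flags)
{
    const unsigned char ws[] = { ' ', '\t', '\n', '\v', '\f', '\r' };
    size_t i;

    memset(space_bits, 0, sizeof space_bits);
    memset(type_flags, 0, sizeof type_flags);
    for (i = 0; i < sizeof ws; i++) {
        unsigned char c = ws[i];
        space_bits[c / 8] |= (unsigned char)(1u << (c % 8));
        if (c != '\v' || vt_in_flags)
            type_flags[c] |= FLAG_SPACE;
    }
    assert(class_has_space(' ') && escape_has_space(' '));  /* sanity check */
}

/* Count byte values on which the two lookups disagree, printing each one. */
static int count_mismatches(void)
{
    int c, n = 0;

    for (c = 0; c < 256; c++) {
        if (class_has_space((unsigned char)c) != escape_has_space((unsigned char)c)) {
            printf("mismatch at 0x%02x\n", c);
            n++;
        }
    }
    return n;
}
```

Calling build_tables(0) and then count_mismatches() reports exactly one disagreement, at VT (0x0b); after build_tables(1) the two lookups agree on every byte. A check of this kind over the real tables might be a quick way to spot a stale character table, though that is only a suggestion.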
Btw., PCRE_EXTENDED seems to be affected by the problem as well: Unless
VT is defined as white space in pcre_chartables.c, VT is not removed
from the pattern in extended mode.
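
The extended-mode symptom can be modelled the same way. The sketch below is a simplified illustration, not PCRE's implementation: strip_extended, fill_flags and WS_FLAG are hypothetical names, and the routine ignores comments, escapes and character classes, which real extended-mode handling has to respect.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WS_FLAG 0x01  /* assumed flag value, for illustration only */

/* Fill a per-character flag table; VT is marked as white space only
   when vt_is_space is nonzero (simulating an updated table). */
static void fill_flags(unsigned char flags[256], int vt_is_space)
{
    const unsigned char ws[] = { ' ', '\t', '\n', '\f', '\r' };
    size_t i;

    memset(flags, 0, 256);
    for (i = 0; i < sizeof ws; i++)
        flags[ws[i]] |= WS_FLAG;
    if (vt_is_space)
        flags[(unsigned char)'\v'] |= WS_FLAG;
}

/* Copy pattern into out, dropping every character the table flags as
   white space. out must have room for strlen(pattern) + 1 bytes.
   Returns the number of characters kept. */
static size_t strip_extended(const char *pattern,
                             const unsigned char flags[256], char *out)
{
    size_t n = 0;

    for (; *pattern != '\0'; pattern++) {
        unsigned char c = (unsigned char)*pattern;
        if (flags[c] & WS_FLAG)
            continue;            /* flagged as white space: skip it */
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```

With a table built by fill_flags(flags, 0), stripping "a \v b" leaves three characters, because the VT survives; with fill_flags(flags, 1) only "ab" remains.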
The pcre_chartables.c which does NOT define VT as white space is attached.