[pcre-dev] [Bug 867] "\w" no longer functions

Author: Mart Goodall
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 867] "\w" no longer functions
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=867




--- Comment #5 from Mart Goodall <mart.goodall@???> 2009-07-28 23:33:46 ---
-----Original Message-----
From: admin@??? [mailto:admin@bugs.exim.org] On Behalf Of Philip Hazel
Sent: Tuesday, July 28, 2009 3:30 PM
To: Mart Goodall
Subject: [Bug 867] "\w" no longer functions




You reported the bug.

--- Comment #3 from Philip Hazel <ph10@???> 2009-07-28 20:29:30 ---

On Tue, 28 Jul 2009, Mart Goodall wrote:



> I kept the compile as close to my production as possible. I have the same
> failure - see screen output below:-
>
> PCRE version @PCRE_MAJOR@.@PCRE_MINOR@@PCRE_PRERELEASE@ @PCRE_DATE@




...Something strange there ... why isn't it giving the real version
number? I see from the original report that you are using Windows. How
did you do the build?



> re> "[^\w\s. |-]"
> data> Maureen Hubbard
> 0: M
>
> PCRE version @PCRE_MAJOR@.@PCRE_MINOR@@PCRE_PRERELEASE@ @PCRE_DATE@
> Compiled with
> UTF-8 support
> Unicode properties support
> Newline sequence is LF
> \R matches all Unicode newlines
> Internal link size = 2
> POSIX malloc threshold = 10
> Default match limit = 10000000
> Default recursion depth limit = 10000000
> Match recursion uses stack
>
> Hope this helps




Not a lot, I'm afraid. On my Gentoo Linux box with exactly the same
compile time options, it works fine. So it's something to do with your
Windows version - I do not run Windows, so there is no way I can test
for myself.

I have three suggestions: (1) Try using autocallout to see what is going
on. This is what I get when I run pcretest (note the "C" option after
the pattern):



PCRE version 7.9 2009-04-11

re> "[^\w\s. |-]"C
data> Maureen Hubbard
--->Maureen Hubbard
 +0 ^                   [^\w\s. |-]
 +0  ^                  [^\w\s. |-]
 +0   ^                 [^\w\s. |-]
 +0    ^                [^\w\s. |-]
 +0     ^               [^\w\s. |-]
 +0      ^              [^\w\s. |-]
 +0       ^             [^\w\s. |-]
 +0        ^            [^\w\s. |-]
 +0         ^           [^\w\s. |-]
 +0          ^          [^\w\s. |-]
 +0           ^         [^\w\s. |-]
 +0            ^        [^\w\s. |-]
 +0             ^       [^\w\s. |-]
 +0              ^      [^\w\s. |-]
 +0               ^     [^\w\s. |-]
 +0                ^    [^\w\s. |-]
No match
data>





The same test on my build gives:

PCRE version @PCRE_MAJOR@.@PCRE_MINOR@@PCRE_PRERELEASE@ @PCRE_DATE@

re> "[^\w\s. |-]"C
data> Maureen Hubbard
--->Maureen\x20Hubbard
 +0 ^                      [^\w\s. |-]
+11 ^^
0: M





This shows that it is testing every character, and they all fail.
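To make the difference between the two traces concrete, here is a rough sketch in Python. It uses Python's own re module, not PCRE, so it only illustrates the expected behaviour of the class. The helper name first_match_trace is made up for this example, and the second pattern, with the backslashes dropped, is purely an assumption chosen because it reproduces the "0: M" result; nothing in the thread says that is what the failing build actually compiled.

```python
import re

SUBJECT = "Maureen Hubbard"

def first_match_trace(pattern, subject):
    """Try the pattern at each start offset in turn, roughly as the
    "+0" callout lines do, and stop at the first offset that matches."""
    compiled = re.compile(pattern)
    tried = []
    for pos in range(len(subject) + 1):
        tried.append(pos)
        m = compiled.match(subject, pos)
        if m:
            return tried, m.group(0)
    return tried, None

# The pattern from the report: every character of the subject is a
# word character or a space, so no offset matches.
tried, hit = first_match_trace(r"[^\w\s. |-]", SUBJECT)
print(len(tried), hit)    # 16 None

# Hypothetical variant (an assumption): \w and \s read as the plain
# letters "w" and "s". "M" is none of the excluded characters.
tried2, hit2 = first_match_trace(r"[^ws. |-]", SUBJECT)
print(tried2, hit2)       # [0] M
```

The first call tries 16 start offsets, the same number as the "+0" lines in the 7.9 trace; the second stops at offset 0, like the trace that ends in "0: M".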

Another option you can use is "D" to show the compiled code:



PCRE version 7.9 2009-04-11

re> "[^\w\s. |-]"D
------------------------------------------------------------------
0 36 Bra
  3     [\x00-\x08\x0b\x0e-\x1f!-,/:-@[-^`{}-\xff] (neg)
36 36 Ket
 39     End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
No first char
No need char
data>





PCRE version @PCRE_MAJOR@.@PCRE_MINOR@@PCRE_PRERELEASE@ @PCRE_DATE@

re> "[^\w\s. |-]"D
------------------------------------------------------------------
0 36 Bra
  3     [\x00-\x1f\x21-\x2c\x2f-rt-vx-{}-\xff] (neg)
36 36 Ket
 39     End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
No first char
No need char



If your output is the same as mine, things are really weird...
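One way to compare the two "D" listings is to expand each class into the set of byte values it names and see which bytes are left out. The following is a minimal Python sketch. It assumes the listing consists only of single characters, \xHH escapes and lo-hi ranges, and that it shows the bytes the compiled class accepts (which is what the match results in this thread suggest). The helpers expand_class and unlisted are hypothetical names for this example, not part of PCRE or pcretest.

```python
# Class listings copied from the two "D" outputs in this thread.
PHILIP_DUMP = r"\x00-\x08\x0b\x0e-\x1f!-,/:-@[-^`{}-\xff"   # PCRE 7.9 run
MART_DUMP = r"\x00-\x1f\x21-\x2c\x2f-rt-vx-{}-\xff"          # failing run

def expand_class(dump):
    """Expand a class listing into the set of byte values it names.
    Assumes a '-' between two items always denotes a range."""
    def read_item(i):
        # Return (byte value, index just past the item).
        if dump.startswith("\\x", i):
            return int(dump[i + 2:i + 4], 16), i + 4
        return ord(dump[i]), i + 1

    listed = set()
    i = 0
    while i < len(dump):
        lo, i = read_item(i)
        if i < len(dump) - 1 and dump[i] == "-":   # "lo-hi" range
            hi, i = read_item(i + 1)
        else:
            hi = lo
        listed.update(range(lo, hi + 1))
    return listed

def unlisted(dump):
    """Bytes 0..255 that the listing leaves out, as a string."""
    return "".join(chr(b) for b in sorted(set(range(256)) - expand_class(dump)))

print(repr(unlisted(MART_DUMP)))     # ' -.sw|'
print(len(unlisted(PHILIP_DUMP)))    # 71
print(ord("M") in expand_class(MART_DUMP), ord("M") in expand_class(PHILIP_DUMP))
```

Under those assumptions, the 7.9 listing leaves out 71 bytes (the letters, digits, underscore, whitespace, ".", "|" and "-"), while the other listing leaves out only space, "-", ".", "s", "w" and "|". That is what one would expect to see if \w and \s had been taken as the literal letters, although the thread does not establish why that would happen.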



My final suggestion is that you edit pcre_internal.h, and change the "0"
in line 50 into "1" and then re-compile. This will insert debugging
statements into the code, and they will generate some output when you
run the test. This may perhaps give some clue as to what is going on.







PCRE version @PCRE_MAJOR@.@PCRE_MINOR@@PCRE_PRERELEASE@ @PCRE_DATE@

re> "[^\w\s. |-]"
------------------------------------------------------------------
[^\w\s. |-]
>> start branch
length=6 added 0 c=[
length=39 added 33 c=
>> end branch
end pre-compile: length=40 workspace=36
Length = 40 top_bracket = 0 top_backref = 0
Options=00000000
0 36 Bra
  3     [\x00-\x1f\x21-\x2c\x2f-rt-vx-{}-\xff] (neg)
36 36 Ket
 39     End
------------------------------------------------------------------
data> Maureen Hubbard
>>>> Match against: Maureen Hubbard
start non-capturing bracket
bracket 0 tail recursion
ims reset to 00
match() returned 1 from line 926 >>>> returning 1
0: M



Good luck

mart



Philip





--

Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email

