[pcre-dev] [Bug 867] "\w" no longer functions

Author: Philip Hazel
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 867] "\w" no longer functions
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=867




--- Comment #3 from Philip Hazel <ph10@???> 2009-07-28 20:29:30 ---
On Tue, 28 Jul 2009, Mart Goodall wrote:

> I kept the compile as close to my production as possible. I have the same
> failure - see screen output below:-
>
> PCRE version @PCRE_MAJOR@.@PCRE_MINOR@@PCRE_PRERELEASE@ @PCRE_DATE@


Something strange there: why isn't it giving the real version number?
I see from the original report that you are using Windows. How did you
do the build?

> re> "[^\w\s. |-]"
> data> Maureen Hubbard
> 0: M
>
>
> PCRE version @PCRE_MAJOR@.@PCRE_MINOR@@PCRE_PRERELEASE@ @PCRE_DATE@
> Compiled with
> UTF-8 support
> Unicode properties support
> Newline sequence is LF
> \R matches all Unicode newlines
> Internal link size = 2
> POSIX malloc threshold = 10
> Default match limit = 10000000
> Default recursion depth limit = 10000000
> Match recursion uses stack
>
>
> Hope this helps


Not a lot, I'm afraid. On my Gentoo Linux box with exactly the same
compile-time options, it works fine. So it's something to do with your
Windows version. I do not run Windows, so there is no way I can test
for myself.

I have three suggestions: (1) Try using autocallout to see what is going
on. This is what I get when I run pcretest (note the "C" option after
the pattern):

PCRE version 7.9 2009-04-11

re> "[^\w\s. |-]"C
data> Maureen Hubbard

--->Maureen Hubbard
 +0 ^                   [^\w\s. |-]
 +0  ^                  [^\w\s. |-]
 +0   ^                 [^\w\s. |-]
 +0    ^                [^\w\s. |-]
 +0     ^               [^\w\s. |-]
 +0      ^              [^\w\s. |-]
 +0       ^             [^\w\s. |-]
 +0        ^            [^\w\s. |-]
 +0         ^           [^\w\s. |-]
 +0          ^          [^\w\s. |-]
 +0           ^         [^\w\s. |-]
 +0            ^        [^\w\s. |-]
 +0             ^       [^\w\s. |-]
 +0              ^      [^\w\s. |-]
 +0               ^     [^\w\s. |-]
 +0                ^    [^\w\s. |-]
No match

data>


This shows that it is testing every character, and they all fail.
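For anyone without a working pcretest to hand, the trace above can be
approximated with a minimal sketch in Python. This is an assumption-laden
stand-in, not PCRE: it uses Python's own "re" engine with the ASCII flag,
whose \s is slightly wider than PCRE's, though that makes no difference
for this subject string. The helper name failing_positions is made up for
illustration.

```python
import re

# The pattern from the report. re.ASCII restricts \w and \s to ASCII,
# which is the closest Python equivalent to a non-UTF-8 PCRE run.
PATTERN = re.compile(r"[^\w\s. |-]", re.ASCII)
SUBJECT = "Maureen Hubbard"


def failing_positions(pattern, subject):
    """Return every start offset at which the pattern fails to match.

    Offsets run from 0 to len(subject) inclusive, mirroring the one
    attempt per position (plus end of string) seen in the callout trace.
    """
    return [
        pos
        for pos in range(len(subject) + 1)
        if pattern.match(subject, pos) is None
    ]


failed = failing_positions(PATTERN, SUBJECT)
print("attempts that failed:", len(failed), "of", len(SUBJECT) + 1)
print("overall search result:", PATTERN.search(SUBJECT))
```

Every character in the subject is a letter or a space, all of which the
negated class excludes, so a correct engine should fail at each offset
and report no match overall.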
Another option you can use is "D" to show the compiled code:

PCRE version 7.9 2009-04-11

re> "[^\w\s. |-]"D

------------------------------------------------------------------
  0  36 Bra
  3     [\x00-\x08\x0b\x0e-\x1f!-,/:-@[-^`{}-\xff] (neg)
 36  36 Ket
 39     End
------------------------------------------------------------------
Capturing subpattern count = 0
No options
No first char
No need char

data>


If your output is the same as mine, things are really weird...
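To make the comparison easier, here is a rough Python sketch that
rebuilds the set of bytes the class should match and checks it against
the ranges in the listing above. It assumes ASCII-only \w and a \s that
covers tab, LF, FF, CR and space but not VT (the listing includes \x0b
among the matching bytes, which is consistent with that). The ranges in
"dumped" were transcribed by hand from the listing, and the names used
are only illustrative.

```python
import string

# Bytes the class [^\w\s. |-] must NOT match, under the assumptions
# stated above (ASCII \w; \s without VT, 0x0b).
WORD = set(map(ord, string.ascii_letters + string.digits + "_"))
SPACE = set(map(ord, "\t\n\f\r "))
LITERALS = set(map(ord, ". |-"))
EXCLUDED = WORD | SPACE | LITERALS


def to_ranges(values):
    """Collapse an ascending list of ints into inclusive (start, end) pairs."""
    ranges = []
    for value in values:
        if ranges and value == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], value)
        else:
            ranges.append((value, value))
    return ranges


matching = [b for b in range(256) if b not in EXCLUDED]
derived = to_ranges(matching)

# Hand-transcribed from the "D" listing:
# [\x00-\x08\x0b\x0e-\x1f!-,/:-@[-^`{}-\xff]
dumped = [
    (0x00, 0x08), (0x0B, 0x0B), (0x0E, 0x1F), (0x21, 0x2C), (0x2F, 0x2F),
    (0x3A, 0x40), (0x5B, 0x5E), (0x60, 0x60), (0x7B, 0x7B), (0x7D, 0xFF),
]

print("listing agrees with derivation:", derived == dumped)
print("subject bytes in class:",
      [c for c in "Maureen Hubbard" if ord(c) in matching])
```

If the class printed on the Windows build differs from these ranges, the
difference should show which bytes are being classified wrongly.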

My final suggestion is that you edit pcre_internal.h, change the "0"
in line 50 into "1", and then re-compile. This will insert debugging
statements into the code, and they will generate some output when you
run the test. This may give some clue as to what is going on.

Philip


--
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email