Revision: 552
http://vcs.pcre.org/viewvc?view=rev&revision=552
Author: ph10
Date: 2010-10-13 11:15:41 +0100 (Wed, 13 Oct 2010)
Log Message:
-----------
Fix \s bug in character classes (always removing VT).
Modified Paths:
--------------
code/trunk/ChangeLog
code/trunk/pcre_compile.c
code/trunk/testdata/testinput1
code/trunk/testdata/testoutput1
Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog 2010-10-10 17:33:07 UTC (rev 551)
+++ code/trunk/ChangeLog 2010-10-13 10:15:41 UTC (rev 552)
@@ -16,6 +16,11 @@
result in overall failure. Similarly, (*COMMIT) now overrides (*PRUNE) and
(*SKIP), (*SKIP) overrides (*PRUNE) and (*THEN), and (*PRUNE) overrides
(*THEN).
+
+3. If \s appeared in a character class, it removed the VT character from
+ the class, even if it had been included by some previous item, for example
+ in [\x00-\xff\s]. (This was a bug related to the fact that VT is not part
+ of \s, but is part of the POSIX "space" class.)
Version 8.10 25-Jun-2010
Modified: code/trunk/pcre_compile.c
===================================================================
--- code/trunk/pcre_compile.c 2010-10-10 17:33:07 UTC (rev 551)
+++ code/trunk/pcre_compile.c 2010-10-13 10:15:41 UTC (rev 552)
@@ -3503,9 +3503,14 @@
for (c = 0; c < 32; c++) classbits[c] |= ~cbits[c+cbit_word];
continue;
+ /* Perl 5.004 onwards omits VT from \s, but we must preserve it
+ if it was previously set by something earlier in the character
+ class. */
+
case ESC_s:
- for (c = 0; c < 32; c++) classbits[c] |= cbits[c+cbit_space];
- classbits[1] &= ~0x08; /* Perl 5.004 onwards omits VT from \s */
+ classbits[0] |= cbits[cbit_space];
+ classbits[1] |= cbits[cbit_space+1] & ~0x08;
+ for (c = 2; c < 32; c++) classbits[c] |= cbits[c+cbit_space];
continue;
case ESC_S:
Modified: code/trunk/testdata/testinput1
===================================================================
--- code/trunk/testdata/testinput1 2010-10-10 17:33:07 UTC (rev 551)
+++ code/trunk/testdata/testinput1 2010-10-13 10:15:41 UTC (rev 552)
@@ -4073,4 +4073,7 @@
** Failers
XABX
+/[\x00-\xff\s]+/
+ \x0a\x0b\x0c\x0d
+
/-- End of testinput1 --/
Modified: code/trunk/testdata/testoutput1
===================================================================
--- code/trunk/testdata/testoutput1 2010-10-10 17:33:07 UTC (rev 551)
+++ code/trunk/testdata/testoutput1 2010-10-13 10:15:41 UTC (rev 552)
@@ -6658,4 +6658,8 @@
XABX
No match
+/[\x00-\xff\s]+/
+ \x0a\x0b\x0c\x0d
+ 0: \x0a\x0b\x0c\x0d
+
/-- End of testinput1 --/
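
The effect of the pcre_compile.c change can be reproduced outside PCRE with a small standalone sketch. It assumes the layout implied by the diff: a 32-byte class bitmap with one bit per 8-bit character, so VT (0x0b) is bit 3 of byte 1, mask 0x08. The names build_posix_space, add_s_old, add_s_new and vt_survives are hypothetical and are not PCRE functions. The space table is built by hand here as a stand-in for cbits+cbit_space.

```c
#include <string.h>

#define CLASS_BYTES 32   /* 256 bits, one per 8-bit character value */
#define VT 0x0b          /* vertical tab: byte 1, bit 3 (mask 0x08) */

/* Set the bit for character c in a 32-byte class bitmap. */
void set_char(unsigned char *bits, int c)
{
    bits[c / 8] |= (unsigned char)(1u << (c % 8));
}

/* Return 1 if character c is in the class bitmap, else 0. */
int has_char(const unsigned char *bits, int c)
{
    return (bits[c / 8] >> (c % 8)) & 1;
}

/* Hypothetical stand-in for the POSIX "space" table:
   HT, LF, VT, FF, CR and the space character. */
void build_posix_space(unsigned char *space)
{
    int c;
    memset(space, 0, CLASS_BYTES);
    for (c = 0x09; c <= 0x0d; c++) set_char(space, c);
    set_char(space, 0x20);
}

/* Pre-fix logic (rev 551): OR in the whole table, then clear VT in the
   class itself. This also clears a VT set by an earlier class item. */
void add_s_old(unsigned char *classbits, const unsigned char *space)
{
    int c;
    for (c = 0; c < CLASS_BYTES; c++) classbits[c] |= space[c];
    classbits[1] &= ~0x08;
}

/* Post-fix logic (rev 552): mask VT out of the table byte before ORing,
   so bits already present in the class are never cleared. */
void add_s_new(unsigned char *classbits, const unsigned char *space)
{
    int c;
    classbits[0] |= space[0];
    classbits[1] |= space[1] & ~0x08;
    for (c = 2; c < CLASS_BYTES; c++) classbits[c] |= space[c];
}

/* Model [\x00-\xff\s]: fill the class, apply \s, report whether VT is
   still a member. use_fixed selects the rev 552 logic. */
int vt_survives(int use_fixed)
{
    unsigned char space[CLASS_BYTES], classbits[CLASS_BYTES];
    build_posix_space(space);
    memset(classbits, 0xff, CLASS_BYTES);   /* the \x00-\xff range */
    if (use_fixed) add_s_new(classbits, space);
    else add_s_old(classbits, space);
    return has_char(classbits, VT);
}
```

In this sketch, vt_survives(0) returns 0 and vt_survives(1) returns 1. That matches the new test case, where /[\x00-\xff\s]+/ is expected to match all of \x0a\x0b\x0c\x0d, including the VT byte.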