[Pcre-svn] [657] code/trunk: Fix pcre_study() bug with \b at…

Author: Subversion repository
Date:  
To: pcre-svn
Subject: [Pcre-svn] [657] code/trunk: Fix pcre_study() bug with \b at start of branch.
Revision: 657
          http://vcs.pcre.org/viewvc?view=rev&revision=657
Author:   ph10
Date:     2011-08-15 18:39:09 +0100 (Mon, 15 Aug 2011)


Log Message:
-----------
Fix pcre_study() bug with \b at start of branch.

Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/pcre_study.c
    code/trunk/testdata/testinput2
    code/trunk/testdata/testoutput2


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2011-08-13 12:27:51 UTC (rev 656)
+++ code/trunk/ChangeLog    2011-08-15 17:39:09 UTC (rev 657)
@@ -245,6 +245,9 @@
 47. The pattern /f.*/8s, when applied to "for" with PCRE_PARTIAL_HARD, gave a
     complete match instead of a partial match. This bug was dependent on both
     the PCRE_UTF8 and PCRE_DOTALL options being set.
+    
+48. For a pattern such as /\babc|\bdef/ pcre_study() was failing to set up the
+    starting byte set, because \b was not being ignored. 



Version 8.12 15-Jan-2011

Modified: code/trunk/pcre_study.c
===================================================================
--- code/trunk/pcre_study.c    2011-08-13 12:27:51 UTC (rev 656)
+++ code/trunk/pcre_study.c    2011-08-15 17:39:09 UTC (rev 657)
@@ -773,7 +773,6 @@
       case OP_NOTUPTOI:
       case OP_NOT_HSPACE:
       case OP_NOT_VSPACE:
-      case OP_NOT_WORD_BOUNDARY:
       case OP_NRREF:
       case OP_PROP:
       case OP_PRUNE:
@@ -791,10 +790,16 @@
       case OP_SOM:
       case OP_THEN:
       case OP_THEN_ARG:
-      case OP_WORD_BOUNDARY:
       case OP_XCLASS:
       return SSB_FAIL;


+      /* We can ignore word boundary tests. */
+       
+      case OP_WORD_BOUNDARY:
+      case OP_NOT_WORD_BOUNDARY:
+      tcode++; 
+      break; 
+
       /* If we hit a bracket or a positive lookahead assertion, recurse to set
       bits from within the subpattern. If it can't find anything, we have to
       give up. If it finds some mandatory character(s), we are done for this

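The effect of the hunk above can be illustrated with a much simplified scan.
This is only a sketch under stated assumptions: the toy_* opcodes, the flat
program layout and both function names are hypothetical and are not PCRE's
real opcode set or the real code in pcre_study.c. It shows the idea that a
zero-width word boundary test is stepped over, so the first literal of each
alternative can still contribute to the starting byte set, while an item the
scan cannot handle makes it give up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy opcodes. TOY_CHAR is followed by one literal byte;
   every other opcode is a single byte. */
enum toy_op {
  TOY_END, TOY_ALT, TOY_CHAR,
  TOY_WORD_BOUNDARY, TOY_NOT_WORD_BOUNDARY, TOY_ANY
};

enum { TOY_FAIL = 0, TOY_DONE = 1 };

/* Test one bit of a 256-bit (32-byte) map. */
int toy_bit_is_set(const unsigned char bits[32], unsigned char c)
{
  return (bits[c >> 3] >> (c & 7)) & 1;
}

/* Scan a TOY_END-terminated program whose alternatives are separated by
   TOY_ALT. For each alternative, set the bit for the byte that must start
   a match. Returns TOY_FAIL if any alternative has no usable first byte. */
int toy_set_start_bits(const unsigned char *code, unsigned char bits[32])
{
  const unsigned char *tcode = code;

  assert(code != NULL && bits != NULL);
  memset(bits, 0, 32);

  for (;;) {
    int found = 0;

    while (!found) {
      switch (*tcode) {
        /* Word boundary tests consume nothing, so they can be ignored. */
        case TOY_WORD_BOUNDARY:
        case TOY_NOT_WORD_BOUNDARY:
          tcode++;
          break;

        case TOY_CHAR:
          bits[tcode[1] >> 3] |= (unsigned char)(1u << (tcode[1] & 7));
          found = 1;
          break;

        /* Anything else (including an empty alternative) defeats the scan. */
        default:
          return TOY_FAIL;
      }
    }

    /* Skip the rest of this alternative, stepping by opcode length. */
    while (*tcode != TOY_ALT && *tcode != TOY_END)
      tcode += (*tcode == TOY_CHAR) ? 2 : 1;

    if (*tcode == TOY_END) return TOY_DONE;
    tcode++;  /* step over TOY_ALT to the next alternative */
  }
}
```

Encoding a toy equivalent of /\babc|\bdef/ and calling toy_set_start_bits()
on it sets the bits for 'a' and 'd'. If the two word boundary cases were
moved into the default case, the same program would return TOY_FAIL, which is
the toy analogue of the behaviour described in ChangeLog item 48.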

Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2011-08-13 12:27:51 UTC (rev 656)
+++ code/trunk/testdata/testinput2    2011-08-15 17:39:09 UTC (rev 657)
@@ -3837,4 +3837,8 @@
 /.(*F)/
     \P\Pabc


+/\btype\b\W*?\btext\b\W*?\bjavascript\b/IS
+
+/\btype\b\W*?\btext\b\W*?\bjavascript\b|\burl\b\W*?\bshell:|<input\b.*?\btype\b\W*?\bimage\b|\bonkeyup\b\W*?\=/IS
+
 /-- End of testinput2 --/

Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2011-08-13 12:27:51 UTC (rev 656)
+++ code/trunk/testdata/testoutput2    2011-08-15 17:39:09 UTC (rev 657)
@@ -12217,4 +12217,20 @@
     \P\Pabc
 No match


+/\btype\b\W*?\btext\b\W*?\bjavascript\b/IS
+Capturing subpattern count = 0
+No options
+First char = 't'
+Need char = 't'
+Subject length lower bound = 18
+No set of starting bytes
+
+/\btype\b\W*?\btext\b\W*?\bjavascript\b|\burl\b\W*?\bshell:|<input\b.*?\btype\b\W*?\bimage\b|\bonkeyup\b\W*?\=/IS
+Capturing subpattern count = 0
+No options
+No first char
+No need char
+Subject length lower bound = 8
+Starting byte set: < o t u
+
 /-- End of testinput2 --/
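
The "Starting byte set: < o t u" line in the new test output is what makes
the fix useful: a matcher can skip subject positions whose byte is not in the
set. The following is a minimal sketch of that idea, not PCRE's matching
code. The sketch_* function names and the 32-byte bitmap layout are
assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build a 256-bit (32-byte) map with one bit set for each byte in the
   NUL-terminated string 'starters'. */
void sketch_build_set(const char *starters, unsigned char bits[32])
{
  assert(starters != NULL && bits != NULL);
  memset(bits, 0, 32);
  for (; *starters != '\0'; starters++) {
    unsigned char c = (unsigned char)*starters;
    bits[c >> 3] |= (unsigned char)(1u << (c & 7));
  }
}

/* Return the offset of the first byte at or after 'start' that is in the
   set, or 'length' if there is none. Only these offsets are worth handing
   to the full matcher. */
size_t sketch_first_candidate(const unsigned char bits[32],
                              const char *subject, size_t length,
                              size_t start)
{
  size_t i;

  assert(start <= length);
  for (i = start; i < length; i++) {
    unsigned char c = (unsigned char)subject[i];
    if ((bits[c >> 3] >> (c & 7)) & 1) break;
  }
  return i;
}
```

With the set built from "<otu" and the subject "my url shell:", the first
candidate offset is 3 (the 'u' of "url"). Restarting the search at offset 4
finds no further candidate and returns the subject length.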