[Pcre-svn] [1544] code/trunk: Fix pcretest loop for \K in lo…

Author: Subversion repository
Date:  
To: pcre-svn
Subject: [Pcre-svn] [1544] code/trunk: Fix pcretest loop for \K in lookbehind assertion.
Revision: 1544
          http://vcs.pcre.org/viewvc?view=rev&revision=1544
Author:   ph10
Date:     2015-04-07 17:19:03 +0100 (Tue, 07 Apr 2015)


Log Message:
-----------
Fix pcretest loop for \K in lookbehind assertion.

Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/pcretest.c
    code/trunk/testdata/testinput2
    code/trunk/testdata/testinput5
    code/trunk/testdata/testoutput2
    code/trunk/testdata/testoutput5


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2015-04-07 15:52:11 UTC (rev 1543)
+++ code/trunk/ChangeLog    2015-04-07 16:19:03 UTC (rev 1544)
@@ -149,7 +149,9 @@
 36. The use of \K in a positive lookbehind assertion in a non-anchored pattern
     (e.g. /(?<=\Ka)/) could make pcregrep loop.


+37. There was a similar problem to 36 in pcretest for global matches.

+
Version 8.36 26-September-2014
------------------------------


Modified: code/trunk/pcretest.c
===================================================================
--- code/trunk/pcretest.c    2015-04-07 15:52:11 UTC (rev 1543)
+++ code/trunk/pcretest.c    2015-04-07 16:19:03 UTC (rev 1544)
@@ -5627,9 +5627,33 @@
         g_notempty = PCRE_NOTEMPTY_ATSTART | PCRE_ANCHORED;
         }


-      /* For /g, update the start offset, leaving the rest alone */
+      /* For /g, update the start offset, leaving the rest alone. There is a 
+      tricky case when \K is used in a positive lookbehind assertion. This can 
+      cause the end of the match to be less than or equal to the start offset. 
+      In this case we restart at one past the start offset. This may return the 
+      same match if the original start offset was bumped along during the 
+      match, but eventually the new start offset will hit the actual start 
+      offset. (In PCRE2 the true start offset is available, and this can be 
+      done better. It is not worth doing more than making sure we do not loop 
+      at this stage in the life of PCRE1.) */


-      if (do_g) start_offset = use_offsets[1];
+      if (do_g) 
+        {
+        if (g_notempty == 0 && use_offsets[1] <= start_offset)
+          {
+          if (start_offset >= len) break;  /* End of subject */ 
+          start_offset++;
+          if (use_utf)
+            {
+            while (start_offset < len)
+              {
+              if ((bptr[start_offset] & 0xc0) != 0x80) break;
+              start_offset++;
+              }
+            }
+          }  
+        else start_offset = use_offsets[1];
+        } 


       /* For /G, update the pointer and length */
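
The offset update described in the comment above can be read in isolation as a small function. The following is only a sketch under stated assumptions: the helper name next_global_offset, its parameters, and the -1 "stop" return value are hypothetical and do not appear in pcretest.c, where the same logic is written inline using start_offset, use_offsets[1], g_notempty, use_utf, bptr and len.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of pcretest.c): given the offset at which
   the last match attempt started and the end offset of the match it found,
   return the offset for the next /g attempt, or -1 if the loop should stop.
   notempty_flags stands in for g_notempty, utf for use_utf. */
static int next_global_offset(const unsigned char *subject, int len,
                              int start_offset, int match_end,
                              int notempty_flags, int utf)
{
  assert(subject != NULL && start_offset >= 0 && start_offset <= len);

  /* Usual case: continue from the end of the previous match. */
  if (notempty_flags != 0 || match_end > start_offset)
    return match_end;

  /* \K inside a lookbehind can leave the match end at or before the start
     offset. Restarting at match_end would repeat the same attempt forever,
     so move forward by one character instead. */
  if (start_offset >= len)
    return -1;                       /* End of subject */
  start_offset++;

  /* In UTF-8 mode, skip continuation bytes (10xxxxxx) so that the new
     offset lands on the first byte of a character. */
  if (utf)
    {
    while (start_offset < len && (subject[start_offset] & 0xc0) == 0x80)
      start_offset++;
    }
  return start_offset;
}
```

For /(?<=\Ka)/g on "aaaaa", the first attempt starts at 0 and ends at 1, so the next attempt starts at 1. That attempt ends at 1 again, which triggers the bump to 2. This is consistent with the first match being reported twice in the expected output for the /g case.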



Modified: code/trunk/testdata/testinput2
===================================================================
--- code/trunk/testdata/testinput2    2015-04-07 15:52:11 UTC (rev 1543)
+++ code/trunk/testdata/testinput2    2015-04-07 16:19:03 UTC (rev 1544)
@@ -4144,4 +4144,10 @@


"(?<=((?2))((?1)))"

+/(?<=\Ka)/g+
+    aaaaa
+
+/(?<=\Ka)/G+
+    aaaaa
+
 /-- End of testinput2 --/


Modified: code/trunk/testdata/testinput5
===================================================================
--- code/trunk/testdata/testinput5    2015-04-07 15:52:11 UTC (rev 1543)
+++ code/trunk/testdata/testinput5    2015-04-07 16:19:03 UTC (rev 1544)
@@ -794,4 +794,10 @@


/[^\xff]*PRUNE:\x{100}abc(xyz(?1))/8DZ

+/(?<=\K\x{17f})/8g+
+    \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
+
+/(?<=\K\x{17f})/8G+
+    \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
+
 /-- End of testinput5 --/


Modified: code/trunk/testdata/testoutput2
===================================================================
--- code/trunk/testdata/testoutput2    2015-04-07 15:52:11 UTC (rev 1543)
+++ code/trunk/testdata/testoutput2    2015-04-07 16:19:03 UTC (rev 1544)
@@ -14393,4 +14393,32 @@
 "(?<=((?2))((?1)))"
 Failed: lookbehind assertion is not fixed length at offset 17


+/(?<=\Ka)/g+
+    aaaaa
+ 0: a
+ 0+ aaaa
+ 0: a
+ 0+ aaaa
+ 0: a
+ 0+ aaa
+ 0: a
+ 0+ aa
+ 0: a
+ 0+ a
+ 0: a
+ 0+ 
+
+/(?<=\Ka)/G+
+    aaaaa
+ 0: a
+ 0+ aaaa
+ 0: a
+ 0+ aaa
+ 0: a
+ 0+ aa
+ 0: a
+ 0+ a
+ 0: a
+ 0+ 
+
 /-- End of testinput2 --/


Modified: code/trunk/testdata/testoutput5
===================================================================
--- code/trunk/testdata/testoutput5    2015-04-07 15:52:11 UTC (rev 1543)
+++ code/trunk/testdata/testoutput5    2015-04-07 16:19:03 UTC (rev 1544)
@@ -1916,4 +1916,32 @@
 No first char
 Need char = 'z'


+/(?<=\K\x{17f})/8g+
+    \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}
+ 0: \x{17f}
+ 0+ 
+
+/(?<=\K\x{17f})/8G+
+    \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}
+ 0: \x{17f}
+ 0+ 
+
 /-- End of testinput5 --/