[Pcre-svn] [303] code/trunk: Memchr() speed-up for unanchore…

Author: Subversion repository
To: pcre-svn
Subject: [Pcre-svn] [303] code/trunk: Memchr() speed-up for unanchored pattern in 8-bit mode.
Revision: 303
          http://www.exim.org/viewvc/pcre2?view=rev&revision=303
Author:   ph10
Date:     2015-07-06 17:05:41 +0100 (Mon, 06 Jul 2015)
Log Message:
-----------
Memchr() speed-up for unanchored pattern in 8-bit mode.


Modified Paths:
--------------
    code/trunk/ChangeLog
    code/trunk/configure.ac
    code/trunk/src/pcre2_match.c


Modified: code/trunk/ChangeLog
===================================================================
--- code/trunk/ChangeLog    2015-07-03 07:04:45 UTC (rev 302)
+++ code/trunk/ChangeLog    2015-07-06 16:05:41 UTC (rev 303)
@@ -6,7 +6,10 @@


 1. Improve matching speed of patterns starting with + or * in JIT.

+2. Use memchr() to find the first character in an unanchored match in 8-bit
+mode in the interpreter. This gives a significant speed improvement.

+
Version 10.20 30-June-2015
--------------------------


Modified: code/trunk/configure.ac
===================================================================
--- code/trunk/configure.ac    2015-07-03 07:04:45 UTC (rev 302)
+++ code/trunk/configure.ac    2015-07-06 16:05:41 UTC (rev 303)
@@ -9,9 +9,9 @@
 dnl be defined as -RC2, for example. For real releases, it should be empty.


 m4_define(pcre2_major, [10])
-m4_define(pcre2_minor, [20])
-m4_define(pcre2_prerelease, [])
-m4_define(pcre2_date, [2015-06-30])
+m4_define(pcre2_minor, [21])
+m4_define(pcre2_prerelease, [-RC1])
+m4_define(pcre2_date, [2015-07-06])

# NOTE: The CMakeLists.txt file searches for the above variables in the first
# 50 lines of this file. Please update that if the variables above are moved.

Modified: code/trunk/src/pcre2_match.c
===================================================================
--- code/trunk/src/pcre2_match.c    2015-07-03 07:04:45 UTC (rev 302)
+++ code/trunk/src/pcre2_match.c    2015-07-06 16:05:41 UTC (rev 303)
@@ -6783,7 +6783,8 @@
       end_subject = t;
       }


-    /* Advance to a unique first code unit if there is one. */
+    /* Advance to a unique first code unit if there is one. In 8-bit mode, the 
+    use of memchr() gives a big speed up. */


     if (has_first_cu)
       {
@@ -6793,8 +6794,15 @@
           (smc = UCHAR21TEST(start_match)) != first_cu && smc != first_cu2)
           start_match++;
       else
+        {
+#if PCRE2_CODE_UNIT_WIDTH != 8
         while (start_match < end_subject && UCHAR21TEST(start_match) != first_cu)
           start_match++;
+#else
+        start_match = memchr(start_match, first_cu, end_subject - start_match);
+        if (start_match == NULL) start_match = end_subject;
+#endif          
+        }   
       }


     /* Or to just after a linebreak for a multiline match */
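
For readers without the surrounding source to hand, the change in the last hunk can be sketched in isolation. The following is a minimal illustration only, not the code from pcre2_match.c: the helper names scan_loop, scan_memchr and check_same_result are hypothetical, and it assumes 8-bit code units and a caseless-free first code unit (the branch that does not need first_cu2). Both helpers return a pointer to the first occurrence of first_cu, or the end pointer if there is none, which mirrors the "if (start_match == NULL) start_match = end_subject;" fix-up in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the plain loop, as still used when the code unit
   width is not 8. Stops at the first match or at the end of the subject. */
static const uint8_t *scan_loop(const uint8_t *start, const uint8_t *end,
  uint8_t first_cu)
{
  while (start < end && *start != first_cu) start++;
  return start;
}

/* Hypothetical helper: the memchr() variant for 8-bit code units.
   memchr() returns NULL when the byte is absent, so that case is mapped
   to the end pointer to keep the same contract as the loop. */
static const uint8_t *scan_memchr(const uint8_t *start, const uint8_t *end,
  uint8_t first_cu)
{
  const uint8_t *p = memchr(start, first_cu, (size_t)(end - start));
  return (p == NULL)? end : p;
}

/* Hypothetical self-check: both variants must agree on any subject. */
static void check_same_result(const char *subject, char cu)
{
  const uint8_t *start = (const uint8_t *)subject;
  const uint8_t *end = start + strlen(subject);
  assert(scan_loop(start, end, (uint8_t)cu) ==
         scan_memchr(start, end, (uint8_t)cu));
}
```

The two helpers are interchangeable in behaviour; the expected gain comes from the C library's memchr(), which is typically optimized well beyond a byte-at-a-time loop. The #if on PCRE2_CODE_UNIT_WIDTH is needed because memchr() searches bytes, so it cannot be used to find a 16-bit or 32-bit code unit.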