Re: [pcre-dev] I'm adding PCRE v2 support to Git. It's a bit…

Author: Ævar Arnfjörð Bjarmason
Date:  
To: Zoltán Herczeg
CC: pcre-dev
Subject: Re: [pcre-dev] I'm adding PCRE v2 support to Git. It's a bit slower than v1
On Mon, Apr 10, 2017 at 4:13 PM, Zoltán Herczeg <hzmester@???> wrote:
> Hi Ævar,
>
> this is really awesome news! I am happy that you choose pcre for git.
>
>>I did some basic performance benchmarks between v1 and v2 of PCRE.
>>Depending on whether we use git-grep or git-log v2 is 1% to 10% slower
>>than v1 when both use JIT.
>
> I would like to see the compilation flags for both pcre1 and pcre2? By default the library compiles without optimization options which has the same effect as -O0 option. This could be changed by setting up a CFLAGS value.


I did some further performance testing and it turns out much of this
was just due to me not compiling with -O3. I pulled pcre & pcre2 from
latest svn, built them both with:

    CXXFLAGS=-O3 CFLAGS=-O3 ./configure --prefix=$PWD/inst --enable-jit


I then linked git, also compiled with -O3, against each of those
libraries. I also fixed the bug you mentioned where the match data from
pcre2_match_data_create_from_pattern() wasn't being shared; it's now
created once, when the pattern is compiled. Thanks!

I then patched git itself so it won't try to take a non-pcre path for
fixed patterns[1], and ran a basic grep benchmark[2]. This shows:

         s/iter extended    basic    pcre2    pcre1    fixed
extended   2.07       --      -0%     -49%     -50%     -50%
basic      2.07       0%       --     -49%     -49%     -50%
pcre2      1.05      97%      97%       --      -0%      -1%
pcre1      1.05      98%      98%       0%       --      -0%
fixed      1.04      99%      99%       1%       0%       --


I.e. there's no difference between v1 & v2 anymore. The log case[3],
though, shows pcre1 being 2% faster than pcre2:

         s/iter    basic extended    pcre2    pcre1    fixed
basic      6.28       --      -0%     -16%     -17%     -18%
extended   6.27       0%       --     -16%     -17%     -18%
pcre2      5.28      19%      19%       --      -2%      -3%
pcre1      5.19      21%      21%       2%       --      -1%
fixed      5.13      23%      22%       3%       1%       --


Both of these tests were run on linux.git.
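
In case the cmpthese output is unfamiliar: as far as I understand
Benchmark.pm, each percentage compares the row's rate (iterations per
second) to the column's, so it can be recomputed from the s/iter
column. A small sketch of that arithmetic, where pct_faster is just a
name picked for illustration; results can be off by a point from the
tables because the s/iter values shown are already rounded.

```c
#include <assert.h>

/*
 * Percentage by which the row entry is faster than the column entry,
 * given seconds per iteration for each. Rate is 1/seconds, so
 * row_rate / col_rate == col_secs / row_secs.
 */
static double pct_faster(double row_secs, double col_secs)
{
	assert(row_secs > 0 && col_secs > 0);
	return 100.0 * col_secs / row_secs - 100.0;
}
```

E.g. pct_faster(1.05, 2.07) is about 97 (pcre2 vs. extended in the grep
table), and pct_faster(5.19, 5.28) is about 1.7, which rounds to the 2%
for pcre1 vs. pcre2 in the log table.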

1.

 $ git diff -U1
diff --git a/grep.c b/grep.c
index aabbac2a5e..f432d0d3f2 100644
--- a/grep.c
+++ b/grep.c
@@ -610,3 +610,3 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt)
         */
-       if (opt->fixed || is_fixed(p->pattern, p->patternlen))
+       if (opt->fixed /*|| is_fixed(p->pattern, p->patternlen)*/)
                p->fixed = !icase || ascii_only;



2.

    LD_PRELOAD=/home/avar/g/pcre/inst/lib/libpcre.so:/home/avar/g/pcre2/inst/lib/libpcre2-8.so \
    PF=~/g/git/ perl -MBenchmark=cmpthese -wE 'cmpthese(50, {
        fixed    => sub { system "$ENV{PF}git grep -F avarasu >/dev/null" },
        basic    => sub { system "$ENV{PF}git grep -G avarasu >/dev/null" },
        extended => sub { system "$ENV{PF}git grep -E avarasu >/dev/null" },
        pcre1    => sub { system "$ENV{PF}git -c grep.patternType=pcre1 grep avarasu >/dev/null" },
        pcre2    => sub { system "$ENV{PF}git -c grep.patternType=pcre2 grep avarasu >/dev/null" },
    })'

3.

    LD_PRELOAD=/home/avar/g/pcre/inst/lib/libpcre.so:/home/avar/g/pcre2/inst/lib/libpcre2-8.so \
    PF=~/g/git/ perl -MBenchmark=cmpthese -wE 'cmpthese(50, {
        fixed    => sub { system "$ENV{PF}git log -F --grep=avarasu >/dev/null" },
        basic    => sub { system "$ENV{PF}git log --basic-regexp --grep=avarasu >/dev/null" },
        extended => sub { system "$ENV{PF}git log --extended-regexp --grep=avarasu >/dev/null" },
        pcre1    => sub { system "$ENV{PF}git -c grep.patternType=pcre1 log --grep=avarasu >/dev/null" },
        pcre2    => sub { system "$ENV{PF}git -c grep.patternType=pcre2 log --grep=avarasu >/dev/null" },
    })'