On Tue, Apr 18, 2017 at 10:09 AM, Zoltán Herczeg <hzmester@???> wrote:
>>>>I.e. no difference in v1 & v2 anymore. The log case though shows pcre1
>>>>being 2% faster than pcre2:
>
> I don't know the reason then, needs more investigation. Btw I have started to improve the first character search optimization in JIT. It is still in progress (corner cases), although you can try it now.
I just tested against pcre1 & pcre2 trunk now with -O3 on both, and v2
is now ~20% faster than v1. I bisected it and it's because of your
recent SSE2 JIT optimizations. Thanks a lot.
The grep case now:
         s/iter  basic extended pcre1 fixed pcre2
basic      2.07     --      -0%  -47%  -50%  -57%
extended   2.06     0%       --  -47%  -50%  -57%
pcre1      1.10    88%      87%    --   -6%  -19%
fixed      1.04    99%      99%    6%    --  -14%
pcre2     0.889   133%     132%   24%   17%    --
And log:
         s/iter extended basic pcre1 pcre2 fixed
extended   6.33       --   -0%  -16%  -17%  -18%
basic      6.33       0%    --  -16%  -17%  -18%
pcre1      5.29      20%   20%    --   -0%   -2%
pcre2      5.27      20%   20%    0%    --   -2%
fixed      5.17      22%   22%    2%    2%    --
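
For anyone reading the tables above: assuming each cell is the row's
rate relative to the column's rate (the layout suggests this, but the
output itself doesn't say so), the percentages can be reproduced from
the s/iter column alone. A minimal sketch; the function name
relative_speed_pct is made up for illustration:

```c
#include <assert.h>

/* Relative speed of "row" versus "col" in percent, given seconds per
 * iteration for each. Positive means the row is faster than the column.
 * Hypothetical helper, not part of any benchmark tool. */
double relative_speed_pct(double row_s_per_iter, double col_s_per_iter)
{
    assert(row_s_per_iter > 0.0 && col_s_per_iter > 0.0);
    /* rate = 1 / (s/iter), so rate_row / rate_col = col_s / row_s */
    return (col_s_per_iter / row_s_per_iter - 1.0) * 100.0;
}
```

E.g. relative_speed_pct(0.889, 1.10) gives roughly 24, matching the
pcre2 row / pcre1 column in the grep table. Small mismatches against the
printed cells are expected, since the s/iter values shown are rounded.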
>>...but wasn't allocating the stack for pcre2 like that, but only for
>>pcre1 (where I think it can't be avoided). Will do that & check how
>>that performs. Thanks again.
>
> The JIT stack should not affect performance, its purpose is extending the memory available for JIT.
But doesn't that impact performance? Or is the alternative that the
match merely runs out of JIT stack and either errors out or falls back
to a slower non-JIT path? The relevant docs don't say, or maybe I've
missed something.
Also, both pcre2_jit_stack_create() & pcre2_match_data_create()
optionally take a general context created with
pcre2_general_context_create(). If I don't want to use my own custom
malloc/free, is this something that's worth doing? The docs just say
you can do it, but don't say whether it's worth it if you're not using
your own memory allocation.