Re: [pcre-dev] JIT regression

Author: Zoltán Herczeg
Date:  
To: Pcre-dev@exim.org, ND
Subject: Re: [pcre-dev] JIT regression
Hi,

That is a difference in strategy. The input cannot be known from the pattern, and your input contains none of the characters a-d. The interpreter searches only for 'a', while the JIT searches for two characters, 'a' and 'd', whose distance is two. The latter is more complicated, but it works better for random input. You can see the difference here:

./pcre2test -tm
PCRE2 version 10.34-RC1 2019-04-22
  re> /abcd/
data> \[012345678a]{2000}

Match time 0.1659 milliseconds
No match
data>
  re> /abcd/jit
data> \[012345678a]{2000}

Match time 0.0027 milliseconds
No match

Since there is no way to tell the input from the pattern, I prefer the optimizations that perform better in more cases.
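
To illustrate why the two strategies behave so differently on the subject above, here is a minimal sketch in Python. It is not the PCRE2 implementation; it only counts how many candidate start positions each kind of search would hand to the matcher. The function names are hypothetical, and the offset of 3 is simply the index of 'd' in "abcd", which is an assumption (the JIT may choose its characters and offsets differently).

```python
def single_char_candidates(subject: str, first: str) -> int:
    """Count positions where a search for `first` alone would stop."""
    count = 0
    pos = subject.find(first)
    while pos != -1:
        count += 1
        pos = subject.find(first, pos + 1)
    return count


def char_pair_candidates(subject: str, first: str, second: str, offset: int) -> int:
    """Count positions where `first` is followed by `second` exactly
    `offset` characters later (offset is an illustrative assumption)."""
    count = 0
    pos = subject.find(first)
    while pos != -1:
        if pos + offset < len(subject) and subject[pos + offset] == second:
            count += 1
        pos = subject.find(first, pos + 1)
    return count


# Same subject as the first pcre2test run: "012345678a" repeated 2000 times.
subject = "012345678a" * 2000
single = single_char_candidates(subject, "a")
pair = char_pair_candidates(subject, "a", "d", 3)
print(single, pair)
```

With this subject the single-character search stops at every 'a', so a full match attempt is started 2000 times, while the two-character search never finds a candidate because the subject contains no 'd'.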

Forcing a single-character search gives more similar results:

./pcre2test -tm
PCRE2 version 10.34-RC1 2019-04-22
  re> /abcd/
data> \[0123456780]{20000}

Match time 0.0097 milliseconds
No match
data>
  re> /abcd/jit
data> \[0123456780]{20000}

Match time 0.0139 milliseconds
No match
data>

Optimizing the possessive dot is a good idea. I will do it. I will also check what changes they made to memchr().

Regards,
Zoltan


-------- Original message --------
From: ND via Pcre-dev < pcre-dev@??? (Link -> mailto:pcre-dev@exim.org) >
Date: 2019 May 26 20:34:18
Subject: [pcre-dev] JIT regression
To: Pcre-dev@??? (Link -> mailto:Pcre-dev@exim.org)
Good day!
Here are some more pcre2test timings:
PCRE2 version 10.33 2019-04-16
/abcd/
\[0123456789]{200000}
Match time 0.1040 milliseconds
No match
/abcd/jit
\[0123456789]{200000}
Match time 1.6320 milliseconds
No match
It seems the JIT is 16 times (!!) slower than the interpreter for such a simple pattern.
Please take a look at it.
Thanks.
--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev