Hi,
this is a difference in strategy. The matcher cannot know the input from the pattern alone, and your input contains none of the characters a-d. The interpreter searches only for 'a', while JIT searches for two characters, 'a' and 'd', whose distance is two. The latter is more complicated, but works better for random input. You can see the difference here:
./pcre2test -tm
PCRE2 version 10.34-RC1 2019-04-22
re> /abcd/
data> \[012345678a]{2000}
Match time 0.1659 milliseconds
No match
data>
re> /abcd/jit
data> \[012345678a]{2000}
Match time 0.0027 milliseconds
No match
Since there is no way to tell the input from the pattern, I prefer the optimizations that perform better in more cases.
Forcing a single-character search gives more similar results:
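To make the two strategies concrete, here is a minimal sketch in C. It is not the PCRE2 interpreter or JIT code; the function names, and the idea of counting candidate start positions instead of measuring time, are my own assumptions for illustration. The offset between the two characters is left as a parameter (dist) and is not meant to reflect the value the JIT actually picks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: single-character strategy.
   Counts every position where the first pattern character occurs,
   i.e. every place a full match attempt would have to start. */
static size_t candidates_single(const char *s, size_t len, char first)
{
    size_t count = 0;
    const char *p = s;
    const char *end = s + len;

    while (p < end && (p = memchr(p, first, (size_t)(end - p))) != NULL) {
        count++;   /* a full match attempt would be made here */
        p++;       /* continue the search after this occurrence */
    }
    return count;
}

/* Hypothetical helper: two-character strategy.
   A position is a candidate only if c1 is found there AND c2 is
   found exactly dist bytes later. */
static size_t candidates_pair(const char *s, size_t len,
                              char c1, char c2, size_t dist)
{
    size_t count = 0;

    assert(dist >= 1);
    if (len <= dist)
        return 0;  /* subject too short to hold both characters */

    for (size_t i = 0; i + dist < len; i++) {
        if (s[i] == c1 && s[i + dist] == c2)
            count++;
    }
    return count;
}
```

On a subject made of repeated "012345678a", candidates_single() reports one candidate per 'a', while candidates_pair() with 'a' and 'd' reports none, because 'd' never follows at the required offset. On a subject that has no 'a' at all, both report zero, and only the raw scanning speed of each search is left to compare.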
./pcre2test -tm
PCRE2 version 10.34-RC1 2019-04-22
re> /abcd/
data> \[0123456780]{20000}
Match time 0.0097 milliseconds
No match
data>
re> /abcd/jit
data> \[0123456780]{20000}
Match time 0.0139 milliseconds
No match
data>
Optimizing the possessive dot is a good idea. I will do it. I will also check what changes they made to memchr().
Regards,
Zoltan
-------- Original message --------
From: ND via Pcre-dev <pcre-dev@exim.org>
Date: 26 May 2019 20:34:18
Subject: [pcre-dev] JIT regression
To: Pcre-dev@exim.org
Good day!
Here are some more pcre2test timings:
PCRE2 version 10.33 2019-04-16
/abcd/
\[0123456789]{200000}
Match time 0.1040 milliseconds
No match
/abcd/jit
\[0123456789]{200000}
Match time 1.6320 milliseconds
No match
It seems JIT is 16 times (!) slower than the interpreter for such a simple pattern.
Please take a look at it.
Thanks.
--
## List details at
https://lists.exim.org/mailman/listinfo/pcre-dev