[pcre-dev] Question regarding regex complexity, catastrophi…

Pàgina inicial
Delete this message
Autor: Kilian Kilger
Data:  
A: pcre-dev
Assumpte: [pcre-dev] Question regarding regex complexity, catastrophic backtrack and jit/no_jit
Dear PCRE developers,

I have a question regarding "catastrophic backtracking" in the
comparison of jit/non-jit matching.

Matching with jit, it was very easy to produce an example which
exceeds the available resources: We take the pattern
"(*LIMIT_MATCH=10)(x+x+x+x+)+y" and as subject we take a string of
length 100000 containing only the letter "x".

It was however surprisingly difficult to get the normal non-jit
matcher pcre2_match to run out of resources. With the "y" at the end
of the pattern, we never got any resource exhaustion in the non-jit
case.

At the end we only succeeded by taking a very long pattern which
consisted of "(*LIMIT_HEAP=1)(*LIMIT_DEPTH=1)(*LIMIT_MATCH=1)" and
then 10000 times the string "x+z?". Even then, appending the "y" at
the end makes the match run very fast.

Why is this so different between jit/non-jit. What surprising
optimizations are done in the non-jit-case?

Thank you and best regards,
Kilian.