Re: [pcre-dev] Calling pcre_dfa_exec on expressions with ba…

Author: Philip Hazel
Date:  
To: Bogdan Harjoc
CC: pcre-dev
Subject: Re: [pcre-dev] Calling pcre_dfa_exec on expressions with backreferences
On Mon, 12 Dec 2011, Bogdan Harjoc wrote:

> Snort uses PCRE to filter traffic through about 3200 regexes, about 300 of
> which contain backreferences and possessive quantifiers. The rest of them
> (2900) could run faster through the alternative DFA algorithm.


There is a pcre_fullinfo facility for finding the number of capturing
parentheses in a pattern (check out PCRE_INFO_CAPTURECOUNT). If this
number is zero, you can be sure there are no back references (because
there's nothing that can be referenced).

> But since pcre_dfa_exec doesn't return an error when given a pcre object
> that uses these two unsupported features, callers have to search for them
> in the expression to decide which algorithm can be applied.


Are you sure about that? pcre_dfa_exec is supposed to return
PCRE_ERROR_DFA_UITEM if it encounters a backreference, or anything else
that it can't handle. If it does not give this error, there is a bug.

Possessive quantifiers are supported by pcre_dfa_exec.

I'm a bit surprised to learn that pcre_dfa_exec runs faster than
pcre_exec on any patterns; usually it is the other way round.
And as Zoltan says, using JIT is sure to be much faster than either of
them.

Philip

--
Philip Hazel