[pcre-dev] [Bug 1027] Faster keyword matching

Author: Philip Hazel
Date:  
To: pcre-dev
Subject: [pcre-dev] [Bug 1027] Faster keyword matching

http://bugs.exim.org/show_bug.cgi?id=1027




--- Comment #3 from Philip Hazel <ph10@???> 2010-10-02 10:55:08 ---
On Fri, 1 Oct 2010, Alan Lehotsky wrote:

> Just thought this might be an interesting optimization to consider - but I
> agree that it's definitely an edge case, which is why I wasn't clamoring
> for you to implement this optimization, rather trying more to gauge the
> interest in adding this kind of capability to PCRE.


I haven't closed the issue; I do realize that there may be people who
"just use a regex because it's easy and they know how". I also know that
there are folks searching quite long strings.

I suspect that any code that is added would be of this form:

1. During study, determine that the regex consists only of literal
character matches (no repeats, no capturing, no parens, nada).

2. If so, do the Aho-Corasick (A-C) preparation, modify the data
structure, and set a flag.

3. When pcre_exec() is called, if the flag is set, run an entirely
different code path. I guess the same applies to pcre_dfa_exec(); I
imagine it would be better there too (assuming one could implement the
same interface).
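
As a rough illustration of the three steps above, here is a minimal,
self-contained sketch in C. It is not PCRE code and does not use PCRE's
real data structures: sketch_study, is_literal_only(),
sketch_study_pattern() and sketch_exec() are hypothetical names made up
for this example, as are the return codes. For a single literal keyword
the A-C automaton reduces to a failure-link table of the kind used by
Knuth-Morris-Pratt, so that is what the "prep" builds here; a real
implementation would presumably inspect the compiled pattern rather
than the pattern text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for study data; NOT PCRE's real structure. */
typedef struct {
    int     literal_only;  /* step 2: flag set when the fast path applies */
    size_t  len;           /* length of the literal keyword */
    size_t *fail;          /* failure links (single-keyword automaton) */
} sketch_study;

/* Step 1: non-empty pattern containing no regex metacharacters. */
static int is_literal_only(const char *pattern)
{
    return pattern[0] != '\0' &&
           strpbrk(pattern, "\\^$.[]|()?*+{}") == NULL;
}

/* Step 2: build the failure links and set the flag.
 * Returns 0 on success (flag may be 0 or 1), -1 if out of memory. */
static int sketch_study_pattern(const char *pattern, sketch_study *out)
{
    out->literal_only = 0;
    out->len = 0;
    out->fail = NULL;
    if (!is_literal_only(pattern)) return 0;

    size_t n = strlen(pattern);
    size_t *fail = malloc(n * sizeof *fail);
    if (fail == NULL) return -1;

    /* fail[i] = length of the longest proper prefix of pattern[0..i]
     * that is also a suffix of it. */
    fail[0] = 0;
    size_t k = 0;
    for (size_t i = 1; i < n; i++) {
        while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
        if (pattern[i] == pattern[k]) k++;
        fail[i] = k;
    }
    out->literal_only = 1;
    out->len = n;
    out->fail = fail;
    return 0;
}

/* Step 3: dispatch on the flag. Returns the offset of the first match,
 * -1 for no match, or -2 where the general matcher would run instead. */
static long sketch_exec(const char *pattern, const sketch_study *sp,
                        const char *subject)
{
    assert(sp != NULL);
    if (!sp->literal_only) return -2;

    size_t k = 0;  /* number of pattern characters currently matched */
    for (size_t i = 0; subject[i] != '\0'; i++) {
        while (k > 0 && subject[i] != pattern[k]) k = sp->fail[k - 1];
        if (subject[i] == pattern[k]) k++;
        if (k == sp->len) return (long)(i + 1 - sp->len);
    }
    return -1;
}
```

The caller would run sketch_study_pattern() once per pattern, call
sketch_exec() for each subject string, and free the fail table when
done. The subject is scanned once with no backtracking, which is where
the gain on long strings would come from.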

A potential problem that I see with this is that it increases the size
of the code to benefit only a few situations. How much this matters
depends, of course, on how big the additional code is.

Philip

