http://bugs.exim.org/show_bug.cgi?id=632
--- Comment #5 from Philip Hazel <ph10@???> 2007-11-20 18:41:59 ---
On Tue, 20 Nov 2007, Sean Middleditch wrote:
> (1) The *ptr == 0 has to be replaced with ptr == end.
> (2) Results from GETCHARINC and friends are checked against 0, so this must be
> modified to check against ptr == end instead.
> (3) Some loops continuously look up the current character in a table, and check
> the return against 0, so these also need a ptr != end check.
> (4) Some parts of the code loop while *ptr == 'x' (where x is some character or
> another), so these need to be replaced with while ptr != end && *ptr == 'x'.
I wonder whether, and by how much, these changes will affect performance.
Hard to say until one has tried it, I suppose.
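For concreteness, points (3) and (4) could look roughly like the sketch below, which scans with an explicit end pointer instead of relying on a zero terminator. The helper names (count_run, skip_class) and the lookup table are hypothetical and are not taken from the PCRE source; GETCHARINC from point (2) is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Point (4): count a run of the byte x, stopping at end rather than at a
   zero byte. Hypothetical helper, not part of PCRE. */
static size_t count_run(const char *ptr, const char *end, char x)
{
  const char *start = ptr;
  assert(ptr <= end);
  while (ptr != end && *ptr == x) ptr++;
  return (size_t)(ptr - start);
}

/* Point (3): a table-driven loop. The table holds a non-zero flag for each
   byte value that belongs to the class. A zero byte in the pattern no
   longer doubles as the terminator, so the end check has to be explicit.
   Hypothetical helper, not part of PCRE. */
static const char *skip_class(const char *ptr, const char *end,
                              const unsigned char table[256])
{
  assert(ptr <= end);
  while (ptr != end && table[(unsigned char)*ptr] != 0) ptr++;
  return ptr;
}
```

In this sketch the extra work per loop iteration is a single pointer comparison; whether that is measurable in practice would still have to be tested.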
> Most of the functions just need to take a copy of the end pointer.
The *internal* functions should not need this if they have the "cd"
(compile data) variable passed to them, because cd->pattern_end has the
required value.
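A minimal sketch of that idea, assuming a cut-down stand-in for the compile data block. Only the pattern_end field is taken from the text above; the struct name, the helper and its signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for the compile data block; the real structure has
   many more fields. */
typedef struct compile_data_sketch {
  const unsigned char *pattern_end;   /* one past the last pattern byte */
} compile_data_sketch;

/* Hypothetical internal function: skips spaces, taking the end of the
   pattern from cd instead of from an extra end-pointer argument. */
static const unsigned char *skip_spaces(const unsigned char *ptr,
                                        const compile_data_sketch *cd)
{
  assert(ptr <= cd->pattern_end);
  while (ptr != cd->pattern_end && *ptr == ' ') ptr++;
  return ptr;
}
```

Functions that already receive cd keep their existing signatures this way; only the loop conditions change.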
> I renamed pcre_compile2 to pcre_compile3 which takes a length used to
> calculate the end pointer, and made pcre_compile and pcre_compile2
> call into pcre_compile3 and call strlen to figure out the proper
> length to pass in.
That's exactly what I would have done.
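The shape of that arrangement, as a self-contained sketch: a length-taking function does the work, and the zero-terminated entry point is a thin wrapper that calls strlen. The names below are hypothetical stand-ins for the proposed pcre_compile3 and the existing entry points; the real functions take more arguments and return a compiled pattern, which is not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What the length-taking function derives from its arguments. */
typedef struct pattern_span {
  const char *start;
  const char *end;      /* start + length, one past the last byte */
} pattern_span;

/* Stand-in for the proposed length-taking entry point. */
static pattern_span compile_with_length(const char *pattern, size_t length)
{
  pattern_span span;
  assert(pattern != NULL);
  span.start = pattern;
  span.end = pattern + length;
  return span;
}

/* Stand-in for the existing zero-terminated entry points: measure the
   string, then delegate. */
static pattern_span compile_zero_terminated(const char *pattern)
{
  assert(pattern != NULL);
  return compile_with_length(pattern, strlen(pattern));
}
```

Existing callers keep their behaviour, while a caller that knows the length can pass a pattern containing zero bytes straight to the length-taking function.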
> Do any of the unit tests include tests that have incomplete expressions or
> incomplete UTF8 characters that could cause the code to try to walk past the
> end of the string, or do I need to add some tests for that?
They may well do, but without searching the test input I can't be sure.
It might be as well to add some specific tests. It never does any harm.
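One kind of specific test is a pattern cut off in the middle of a UTF-8 character. The sketch below shows the check such a test would exercise: whether the character starting at ptr lies entirely before end. The function names are hypothetical, and the sketch assumes sequences of at most four bytes and does not validate the continuation bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Number of continuation bytes implied by a lead byte. Stray continuation
   bytes and anything at or above 0xF8 are treated as single bytes here;
   rejecting them is left to separate validity checking. */
static int utf8_extra_bytes(unsigned char lead)
{
  if (lead < 0xC0) return 0;
  if (lead < 0xE0) return 1;
  if (lead < 0xF0) return 2;
  if (lead < 0xF8) return 3;
  return 0;
}

/* Returns 1 if the whole character starting at ptr lies before end, and 0
   if the buffer is empty or ends part-way through the character. */
static int utf8_char_fits(const unsigned char *ptr, const unsigned char *end)
{
  assert(ptr <= end);
  if (ptr == end) return 0;
  return (size_t)(end - ptr) > (size_t)utf8_extra_bytes(*ptr);
}
```

Test inputs could then include, for example, the two-byte sequence 0xC3 0xA9 with the length given as 1, or the three-byte sequence 0xE2 0x82 0xAC with the length given as 2.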
Philip