Re: [Exim] Protecting Percent-Hack exploitable machines with exim

Author: Philip Hazel
Date:
To: Phil Pennock
CC: exim-users
Subject: Re: [Exim] Protecting Percent-Hack exploitable machines with exim

On Tue, 21 Aug 2001, Phil Pennock wrote:

> Fair enough. In any case, the point of the regexp itself being
> inefficient stands -- ^.* is never good.

Ahem.

Well, it depends on what you are using the regex for. If you want to
test "is there a % in this string?", for example, then using either
^.*% or ^.*?% is equally good. The first whips through to the end of
the string, and then goes backwards character by character, looking for
%. The second goes forwards through the string character by character,
looking for %. Either way you look at each character until you hit a %
or have scanned the whole string. I don't see any way of avoiding this
work.
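
As a minimal sketch of the "is there a % in this string?" test, the two patterns can be compared with Python's re module. That is an assumption made for illustration: the discussion above concerns Exim's regex engine, and the function names and sample addresses are hypothetical. Both patterns give the same yes/no answer and differ only in where the match ends.

```python
import re

# Greedy: .* runs to the end of the line, then backs up looking for %.
GREEDY = re.compile(r"^.*%")
# Lazy: .*? advances one character at a time looking for %.
LAZY = re.compile(r"^.*?%")

def has_percent_greedy(s: str) -> bool:
    return GREEDY.search(s) is not None

def has_percent_lazy(s: str) -> bool:
    return LAZY.search(s) is not None

# Hypothetical sample strings (no embedded newlines).
samples = ["user%relay@example.com", "user@example.com", "%", ""]
results = [(s, has_percent_greedy(s), has_percent_lazy(s)) for s in samples]

# The answers agree; only the extent of the match differs.
greedy_end = GREEDY.search("a%b%c").end()  # stops after the last %
lazy_end = LAZY.search("a%b%c").end()      # stops after the first %
```

If only the yes/no answer matters, as in the test described above, the difference in match extent is irrelevant.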

And there is a "standard" case where ^.* is actually very good. It is
when you are asking the question "does this string end with XXX?" where
XXX is a fixed string. The pattern ^.*(?<=XXX) is much faster than
the pattern XXX$ and gets better still with increasing length of
XXX. That's because the first pattern goes to the end, and then does one
check for XXX which either succeeds or fails. The second pattern looks
for XXX at any point in the string, then checks to see if it's at the
end. So it's doing more work churning through the string (and maybe
hitting partial false matches along the way).

If you don't use a lookbehind, ^.*XXX$ is no better than XXX$
because it also looks right along the string.
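
The three "ends with XXX" patterns can be sketched as follows, again using Python's re module as an assumed stand-in for the engine under discussion. One caveat: in a backtracking matcher, ^.*(?<=XXX) written without a trailing $ can also succeed when XXX occurs earlier in the string, because .* backs up until the lookbehind is satisfied. The sketch therefore appends $ to the lookbehind pattern, which is an addition not present in the text above. No timings are shown, since relative speed depends on the engine and its optimizations.

```python
import re

# "XXX" is the placeholder fixed string from the discussion above.
# Assumption: the trailing $ is added here so the lookbehind can only
# be accepted at the very end of the string.
LOOKBEHIND = re.compile(r"^.*(?<=XXX)$")
PLAIN = re.compile(r"XXX$")
PREFIXED = re.compile(r"^.*XXX$")

def ends_with(pattern: re.Pattern, s: str) -> bool:
    """Return True if the compiled pattern finds a match in s."""
    return pattern.search(s) is not None

# Hypothetical sample strings (no embedded newlines).
samples = ["abcXXX", "aXXXb", "XXX", "XX", ""]
table = {
    s: (ends_with(LOOKBEHIND, s), ends_with(PLAIN, s), ends_with(PREFIXED, s))
    for s in samples
}

# Without the trailing $, backtracking lets the lookbehind succeed
# even though the string does not end with XXX.
unanchored_matches = re.search(r"^.*(?<=XXX)", "aXXXb") is not None
```

All three anchored patterns agree on every sample; they differ only in how much of the string the matcher has to examine to reach the answer.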

Philip

-- 
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.
