On 3 Oct 2002, Nigel Metheringham wrote:
> On Thu, 2002-10-03 at 14:21, Hugh Sasse Staff Elec Eng wrote:
> > Changing (\\\S+\\\.( to (.+\\\.( in the unquoted matcher seems to
> > trap this header for me.
>
> It will also seriously false positive - remember that new lines are
> translated into spaces in the filter body variables.
After similar MIME messages got past our system filter, we changed the
regexp match to trap spaces as well. In fact we changed it to match all
characters which are legal in FAT/NTFS file systems (or actually all
characters which are _not_ illegal):
Change: \\\\S+
To: [^\":/\\\\\\\\?<>|]{1,253}
in the unquoted matches.
We were aware that the false +ve rate might increase, but in fact having
used this for almost 5 months now we've had no false positives - the key
point being that we limit the matches to 253 characters, the maximum
filename length under FAT/NTFS [*]. (We handle about
25 000 emails a week in our domain, just to put that in context.)
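
To illustrate the difference, here is a minimal sketch in Python rather than in Exim filter syntax. It assumes the unescaped form of the character class above is [^":/\\?<>|]{1,253}; the extension list, the "name=" prefix, the sample header and the helper matched_name are hypothetical and only there to make the example runnable, they are not the actual system filter.

```python
import re

# Assumed unescaped form of the class from the post: any character except
# " : / \ ? < > |, repeated between 1 and 253 times.
NAME_CHARS = r'[^":/\\?<>|]{1,253}'

# Hypothetical extension list, for illustration only.
EXTENSIONS = r'(?:exe|vbs|pif|scr)'

# Original style of unquoted matcher: the filename may not contain whitespace.
old_matcher = re.compile(r'name=(\S+\.' + EXTENSIONS + r')\b', re.IGNORECASE)

# Modified matcher: the filename may contain spaces, but is length-limited.
new_matcher = re.compile(
    r'name=(' + NAME_CHARS + r'\.' + EXTENSIONS + r')\b', re.IGNORECASE
)


def matched_name(matcher, text):
    """Return the captured filename, or None if the matcher finds nothing."""
    m = matcher.search(text)
    return m.group(1) if m else None


# Hypothetical header with an unquoted filename containing a space.
header = 'Content-Type: application/octet-stream; name=holiday photos.exe'

print(matched_name(old_matcher, header))
print(matched_name(new_matcher, header))
```

With this sample header the \S+ version finds nothing, because the space stops the match before the extension, while the character-class version captures the whole filename. The {1,253} bound means a run of more than 253 allowed characters before the extension will not match, which is what keeps the pattern from running across large stretches of unrelated text.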
>
> Basically you just can't parse MIME information by regular expressions,
> especially when the raw data you are working with has been corrupted
> previously. Thats why I gave up on the filter.
>
For that reason we're going to move to using the facilities of exiscan
quite soon. It will also enable us to reject active content at SMTP time
(much better for the new worms which fake their sender envelope
addresses). However, the system filter still provides very useful
protection against active content at very little cost. It has been a real
help in the last couple of years!
Graeme
[*] Actually 255 characters, but the ".extension" has to be matched too.
I know that extensions are usually 3 characters, but I wasn't sure if this
would always be so (and couldn't be bothered finding out).
--------------------------------------------------------------------
Dr Graeme Stewart http://www.astro.gla.ac.uk/users/graeme/
Department of Physics and Astronomy, University of Glasgow, Scotland