Re: [Exim] High Load

Author: Philip Hazel
Date:  
To: Christopher Curtis
CC: Exim Users Mailing List
Subject: Re: [Exim] High Load
On Tue, 11 Dec 2001, Christopher Curtis wrote:

> If message_body matches ... regex is:
>    "name=\".*\\..*\\.(ad[ep]|ba[st]|chm|cmd|com|cpl|crt|eml|exe|hlp|hta|in[fs]|isp|jse?|lnk|md[be]|ms[cipt]|pcd|pif|reg|scr|sct|shs|url|vb[se]|ws[fhc])\""
> ... idea is to pick up *.*.foo (double extension executables)


name=".*\..*\.(ad[ep]|ba[st]|chm|cmd|com|cpl|crt|eml|exe|hlp|hta|in[fs]|isp|
jse?|lnk|md[be]|ms[cipt]|pcd|pif|reg|scr|sct|shs|url|vb[se]|ws[fhc])"

Hmm. That isn't a "classic" high-load regex (nested unlimited repeats),
but it could well be something similar with the double .*. I can believe
that that regex might take a long time to yield "no" on 5000 characters
if there is at least one occurrence of name=" in the string, especially
if it is near the beginning.

We have the technology. I can test this using pcretest. On a 500 (sic)
character data string not containing name=" it takes 0.018ms to yield "no
match". However, if I put name=" at the start, it takes 0.220ms. That's a
factor of about 12. Hmm. On a 5000 character line it takes 2.390ms. That's not
enormously long. Truly "pathological" regular expressions run for hours
and hours.
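
For anyone who wants to repeat this kind of measurement without pcretest, here is a rough sketch using Python's re module. Note the assumptions: Python's engine is not PCRE, so the absolute timings will not match the figures above, and the helper names (time_search, make_subject) and the filler text are invented for illustration. Only the shape of the result (a cheap "no" without name=", a costlier "no" with it) is what one would expect to carry over.

```python
import re
import time

# Regex from the quoted message, with the filter-file escaping removed.
DOUBLE_EXT = re.compile(
    r'name=".*\..*\.(ad[ep]|ba[st]|chm|cmd|com|cpl|crt|eml|exe|hlp|hta|in[fs]|isp|'
    r'jse?|lnk|md[be]|ms[cipt]|pcd|pif|reg|scr|sct|shs|url|vb[se]|ws[fhc])"'
)


def time_search(subject, repeats=5):
    """Return (match or None, best observed seconds for one search)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = DOUBLE_EXT.search(subject)
        best = min(best, time.perf_counter() - start)
    return result, best


def make_subject(length, with_prefix):
    """Build a hypothetical single-line test string of exactly `length` chars.

    The filler contains many dots but no closing quote, so the two .*
    repeats have plenty of positions to try before the regex gives up.
    """
    filler = "abc.def " * (length // 8 + 1)
    prefix = 'name="' if with_prefix else ""
    return (prefix + filler)[:length]


if __name__ == "__main__":
    for length in (500, 5000):
        for with_prefix in (False, True):
            match, seconds = time_search(make_subject(length, with_prefix))
            label = 'with name="' if with_prefix else 'without name="'
            print(f"{length} chars, {label}: "
                  f"matched={match is not None}, {seconds * 1000:.3f} ms")
```

The best-of-several timing is used only to damp scheduling noise; it is a convenience, not a substitute for testing against the PCRE version that Exim is actually linked with.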

Nevertheless, I suspect that the delay is somehow related to what you
are doing in the filter. I guess the way to narrow it down would be to
take one of the messages that causes the delay, and feed it in again (in
testing mode) using exim -d11 -bh so that you can see where in the
filter interpretation the delay is occurring.

--
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.