Re: [Exim] Spamassassin at SMTP time with local

Author: Marc MERLIN
Date:
To: Matthew Byng-Maddick
CC: exim-users
Subject: Re: [Exim] Spamassassin at SMTP time with local_scan

On Thu, Apr 18, 2002 at 10:07:26AM +0100, Matthew Byng-Maddick wrote:
> > Running SA at SMTP time should be ok since I've never seen it take more
> > than 5 sec (maybe 10) on my systems.
>
> My point is that this is not "seeking to minimize", which, as you'll notice

It's not, you are right.
You know what? I'm ok with "bending" this RFC.

> > Sure, RFC 1047 says:
> [stuff about accepting mail to queue it, rather than spawn a delivery process
> straight away]
> > but this was in the days when we weren't getting all the crap we get nowadays.
>
> So you don't accept mail you can't deliver straight away? In fact, what
> you'll notice is that modern MTAs all do that.

Not true: mail that can't be delivered right away gets queued on my systems,
like it should be.
My point is that I'm ok with delaying the response to DATA by 5 to 10 secs max
(really 3 or so in general), even if that violates the intent of this RFC, which
was written ages ago when we didn't receive all the crap we receive
nowadays.

> > Is there an RFC that says that you MUST or SHOULD acknowledge DATA within
> > X seconds?
>
> No. It says (as I quoted) that you MUST seek to minimise the time in
> processing, while at the same time making sure it's safely written to disk.

You are correct.

> As I say, the problem is not in terms of the time taken at your mail system,
> but the overcongestion of other bits. This is why the timeout has been
> increased. However, a minute is probably reasonable. If OTOH, you were saying
> "I know I'm always around 5 minutes to do this" then it would be a different
> story, IMHO.

Agreed. While I'm ok with delaying the ACK by 3-4 secs on average, I
wouldn't be ok if those times were to reach one minute or more.
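
For what it's worth, one way to enforce that kind of cap is to give the
scanner a hard wall-clock deadline and let the caller accept the message
unscanned if the deadline is missed. Below is a minimal sketch in C, assuming
the scanner runs as a child process. The name run_with_deadline and its
return codes are made up for this example and are not part of Exim or
SpamAssassin.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with a wall-clock limit.
 * Returns the child's exit status (0-255), -1 on fork/wait failure or
 * abnormal exit, and -2 if the child was killed for missing the deadline. */
int run_with_deadline(char *const argv[], int limit_secs)
{
    assert(limit_secs > 0);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                     /* exec failed */
    }

    struct timespec start, now;
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (;;) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
        if (r < 0 && errno != EINTR)
            return -1;

        clock_gettime(CLOCK_MONOTONIC, &now);
        double elapsed = (double)(now.tv_sec - start.tv_sec)
                       + (double)(now.tv_nsec - start.tv_nsec) / 1e9;
        if (elapsed >= (double)limit_secs) {
            kill(pid, SIGKILL);         /* too slow: give up on this scan */
            waitpid(pid, &status, 0);   /* reap the child, no zombie left */
            return -2;
        }

        struct timespec nap = { 0, 50 * 1000 * 1000 };  /* poll every 50 ms */
        nanosleep(&nap, NULL);
    }
}
```

A caller would treat -2 as "no verdict" and carry on with normal delivery, so
a slow scanner can only ever delay the ACK by limit_secs.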

> Sure, however, I have filed a bug against SA that it appears that it
> IMO incorrectly flags headers as malformed, when it should say "Unusual but
> correct formation". I'd say that a receiving MTA has business doing checks
> on the headers of the message but not on the body (for spam). I feel that
> the latter is somewhat of a layer violation, though this ends up being

I'll agree with that, but I think I'm ok with bending that rule.

> > - forking spamc takes milliseconds, running the spamd checks takes several
> > seconds. Trying to save on the fork by making local_scan
> > significantly more complex didn't seem worth it.
>
> erm. fork() is a very quick call, *however* exec() is the most complicated
> system call that most UNIX variants have to do, consider: first they have to

You're right, I meant fork/exec, and yes it may be more than 1-2
milliseconds, but you'd be surprised how fast it is on recent systems.

> work out the type of the binary, then they have to throw away the current
> memory pages, then they have to map the text of the interpreter (which may
> be the dynamic loader) into those pages, and the text of the binary itself.
> as well as all sorts of other random stuff.

True, but that's still probably 100 to 1000 times faster than what spamd
does afterwards, which is why I decided it was negligible.
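
If you want a number rather than a guess, the fork/exec cost is easy to
measure. Here is a rough sketch in C that times fork + exec + wait of a
trivial command and averages over a few rounds. The function name forkexec_ms
is invented for this example, and it measures the whole round trip including
process exit, so treat the result as an upper bound on the exec cost alone.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: average wall-clock milliseconds for one
 * fork + exec + wait of argv[0], or -1.0 if fork or wait fails. */
double forkexec_ms(char *const argv[], int rounds)
{
    assert(rounds > 0);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    for (int i = 0; i < rounds; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1.0;
        if (pid == 0) {
            execvp(argv[0], argv);
            _exit(127);                 /* exec failed */
        }
        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1.0;
    }

    clock_gettime(CLOCK_MONOTONIC, &t1);
    double total_ms = (double)(t1.tv_sec - t0.tv_sec) * 1e3
                    + (double)(t1.tv_nsec - t0.tv_nsec) / 1e6;
    return total_ms / rounds;
}
```

Calling it with the argv of a statically and a dynamically linked build of the
same binary gives a direct comparison of the two on your own hardware.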

That does not mean that, once it's all working and I'm happy with the result,
the spamc network code can't be moved into local_scan.
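
To give an idea of how little network code that would be, here is a sketch in
C of the two pieces that matter: building a CHECK request and parsing spamd's
verdict. It leaves out the socket handling entirely. The wire format
("CHECK SPAMC/1.2", a Content-length header, and a "Spam: True ; score /
threshold" reply line) is written from my reading of spamd's protocol notes,
so check it against the spamd version you run. The function names are made up
for this example and are not taken from spamc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a CHECK request for msg into buf.
 * Returns the number of bytes written (the result is NOT NUL-terminated),
 * or -1 if buf is too small. */
long build_check_request(char *buf, size_t buflen,
                         const char *msg, size_t msglen)
{
    assert(buf != NULL && msg != NULL);

    int hdr = snprintf(buf, buflen,
                       "CHECK SPAMC/1.2\r\nContent-length: %zu\r\n\r\n",
                       msglen);
    if (hdr < 0 || (size_t)hdr >= buflen || msglen > buflen - (size_t)hdr)
        return -1;

    memcpy(buf + hdr, msg, msglen);     /* message follows the blank line */
    return (long)hdr + (long)msglen;
}

/* Hypothetical helper: parse a reply such as
 *   "SPAMD/1.1 0 EX_OK\r\nSpam: True ; 15.5 / 5.0\r\n\r\n"
 * Returns 0 and fills the outputs on success, -1 on any error. */
int parse_spamd_reply(const char *reply, int *is_spam,
                      double *score, double *threshold)
{
    int code;
    char flag[16];

    /* Status line: protocol version, then a numeric code (0 means OK). */
    if (sscanf(reply, "SPAMD/%*s %d", &code) != 1 || code != 0)
        return -1;

    const char *line = strstr(reply, "Spam: ");
    if (line == NULL)
        return -1;
    if (sscanf(line, "Spam: %15s ; %lf / %lf", flag, score, threshold) != 3)
        return -1;

    *is_spam = (strcmp(flag, "True") == 0);
    return 0;
}
```

With those two in place, local_scan would only need to connect, write the
request, read the reply, and map the verdict to accept or reject.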

> the load can be helped somewhat by making spamc statically linked, and as
> small as possible, but be careful with it.

That's a good suggestion, I'll build a static spamc. Every little increase
in perf helps.
(But if you are really worried about perf, you probably shouldn't be running
spamd in the first place :-D)

Marc
--
Microsoft is to operating systems & security ....
                                      .... what McDonalds is to gourmet cooking

Home page: http://marc.merlins.org/ | Finger marc_f@??? for PGP key