>My reason for writing this is to facilitate collaboration to design a
>"perfect" local_scan interface. Philip has the final say regarding
>what will or will not be included in the official exim release, of
>course. I want to help make the interface meet my desires, and also
>to help implement it (now :-)) as I am able.
>
>Please contribute your $0.02 (in whatever monetary unit you
>deem most appropriate).
>The Problem :
I agree with the problem statement & goals for this discussion.
>I think some basic premises must be agreed upon :
I agree with Derrick, and I would add one premise:
The exposed interface should be "robust," in the sense that it should
support as much flexibility in scanner function as possible, while (more
importantly) protecting Exim itself from being corrupted or damaged by
faulty scanner operation.
To this end, I would recommend an approach like Derrick's "pipe" idea.
It is well-understood (stdin, stdout), and can evolve without modifying
the underlying communication method (e.g. sort of like the way HELO
evolved into EHLO). If people want to add extra functionality to the
interface, (e.g. TCP/IP to a remote machine), they can use a separate
package (e.g. Stunnel) to provide it. That is, using a pipe, I think,
maintains the maximum flexibility for scanner designers and users, with
a small footprint as well.
> Cons:
> o Requires creating a new protocol.
> (or beating an existing one (maybe LMTP or BSMTP) into a
> shape suitable for this use)
> o Could be complex.
First, I want to say that, with the rising tide of spam and viruses in
the world, creating such a protocol is a VERY GOOD IDEA, which will
likely have wide appeal. It could be the enabling technology to make
spam and virus filtering dramatically more common. Good Standards are
very important things, and I commend this group for having the
discussion.
I think we can address the first issue by leveraging the existing
local_scan "protocol." Also, Derrick has a good start on the types of
options required:
> Additional Data:
> One idea I had for this is using a pipe. exim would open a
> pipe to the specified scanner program. The message would be
> passed to the scanner on stdin. The exit code from the
> scanner program would determine what exim should do with it --
> accept, tempreject, permreject. If the scanner rejects the
> message, its output would be the message to return to the
> other server. Otherwise its output would specify how the
> message should be modified (namely adding or modifying
> headers). The format of the header modification text is a
> detail that can be worked out later.
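To make the quoted pipe idea a bit more concrete, here is a minimal sketch of the calling side. It is written in Python purely for illustration (Exim itself is C), and everything named in it is my own assumption: the function run_scanner, the exit-code values 0/1/2, and the choice to treat a crashed, missing, or hung scanner as a temporary rejection. None of this is an agreed interface.

```python
import subprocess
import sys

# Hypothetical exit-code convention; the thread has not fixed actual values.
ACCEPT, TEMPREJECT, PERMREJECT = 0, 1, 2
VERDICTS = {ACCEPT: "accept", TEMPREJECT: "tempreject", PERMREJECT: "permreject"}


def run_scanner(argv, message, timeout=30.0):
    """Feed the message to the scanner on stdin and return (verdict, output).

    On a rejection the output is the text to return to the sending server;
    otherwise it is the header-modification text (format still undecided).
    """
    try:
        proc = subprocess.run(argv, input=message, capture_output=True,
                              timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # Assumption: a scanner that cannot start or hangs must not hurt the
        # MTA, so fall back to a temporary rejection and let the sender retry.
        return "tempreject", b""
    # Assumption: unknown exit codes (or death by signal) are also temporary.
    verdict = VERDICTS.get(proc.returncode, "tempreject")
    return verdict, proc.stdout


if __name__ == "__main__":
    # Toy scanner: reads the message, adds one header, and accepts.
    toy = [sys.executable, "-c",
           "import sys; sys.stdin.buffer.read(); print('X-Scanned: yes')"]
    print(run_scanner(toy, b"Subject: hi\n\nbody\n"))
```

The point of the fallback branches is the robustness premise above: whatever the scanner does, the caller always ends up with one of the three verdicts.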
As for Exim's output to the scanner, I suggest adding a single text
string, to be specified in the Exim config file (sent just prior to
sending the actual email message). It would be wonderful to be able to
specify a "general" string that can be expanded to include the contents
of available Exim variables (at the time local_scan is called, of
course). This string should not be global in scope, but specific to the
instance of local_scan called (and any expansion done immediately prior
to calling local_scan). This string would be scanner-unique, and could
be used for any purpose devised by the scanner creator. One could then
even pass multiple commands, options, etc. by designing the scanner to
recognize a separator string/character. A null string would be
permissible (and in fact should be the default). With such a scheme,
one could even create a "parent" local_scan function that could parse
the options string, and call any number of different scanners, and feed
the results back into Exim.
To harp on the point, with the options string, one could call a virus
scanner with one call to local_scan, and then call SpamAssassin with
another call to local_scan. Imagine:
require=local_scan("exim-sa;$domain;$local_part")
deny=local_scan("antivirus;killall;log")
I haven't thought through the contents of the options string; I am just
tossing a wild example out there.
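For what it is worth, the "parent" idea could look roughly like the sketch below (Python for brevity, not a proposal for the real implementation language). It assumes the string has already been expanded by Exim, that ";" is the separator, and that the first field names the scanner. The names parse_options and dispatch, the handler signature, and the fail-safe verdicts are all hypothetical.

```python
def parse_options(options, sep=";"):
    """Split an already-expanded options string into (scanner_name, args)."""
    if not options:
        # The null string is permitted and is the proposed default.
        return None, []
    name, *args = options.split(sep)
    return name, args


def dispatch(options, scanners, message):
    """Hand the message to the scanner named by the first field."""
    name, args = parse_options(options)
    if name is None:
        return "accept"        # assumption: no options means nothing to run
    handler = scanners.get(name)
    if handler is None:
        return "tempreject"    # assumption: an unknown scanner fails safe
    return handler(message, args)


if __name__ == "__main__":
    # Stand-in scanners; real ones would talk to SpamAssassin, a virus
    # scanner, and so on.
    scanners = {
        "exim-sa": lambda msg, args: "accept",
        "antivirus": lambda msg, args: "permreject" if b"EICAR" in msg else "accept",
    }
    print(dispatch("exim-sa;example.org;jim", scanners, b"hello"))
    print(dispatch("antivirus;killall;log", scanners, b"EICAR test"))
```

Because the string is opaque to Exim, each scanner is free to interpret its own remaining fields however it likes.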
I am thinking that it should behave as much like existing ACL commands
(tests) as possible, and be callable from any ACL. This would maintain
the "flavor" of Exim, and might be easiest to incorporate in the code.
The only real problem I see, with such a pipe scheme ("pipe dream"?
*grins*), is the unknown time delay introduced by the scanner. I
believe we would need a timeout on the connection, or perhaps a
"keepalive" scheme? Do we keep processing other messages while waiting
for the scanner to return (parallel processing)? Or is it to be an
"inline" (serial processing) type function? How are current ACL tests
handled? I defer to other experts in these areas.
Just another thought: could we make the timeout be a function of the
original message size? That way, we can have shorter, more reasonable
timeouts for ordinary messages, but let the scanner crunch a bit longer
on larger messages.
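A size-scaled timeout could be as simple as a linear rule with a ceiling. The sketch below is only that, a sketch in Python: the function name scan_timeout and every number in it (10 seconds base, 2 seconds per MiB, 300 seconds maximum) are made-up illustrative values, not recommendations.

```python
def scan_timeout(size_bytes, base=10.0, per_mib=2.0, ceiling=300.0):
    """Return a scanner timeout in seconds that grows with message size.

    All defaults are illustrative assumptions and would presumably be
    configurable.
    """
    if size_bytes < 0:
        raise ValueError("message size cannot be negative")
    mib = size_bytes / (1024 * 1024)
    # Linear growth, clamped so a huge message cannot stall things forever.
    return min(ceiling, base + per_mib * mib)


if __name__ == "__main__":
    for size in (2_000, 5 * 1024 * 1024, 500 * 1024 * 1024):
        print(size, scan_timeout(size))
```

The ceiling matters as much as the scaling: without it, a sufficiently large message would effectively disable the timeout.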
Just my two pennies...
Jim Roberts
Punster Productions, Inc.