[Exim] local_scan interface discussion

Author: Derrick 'dman' Hudson
Date:
To: exim-users
Subject: [Exim] local_scan interface discussion

--

[
If this should be discussed in a separate (new) forum, let me know
and we'll set something up. Otherwise please keep the discussion
properly threaded so people can follow or ignore it as they wish.
]

My reason for writing this is to facilitate collaboration to design a
"perfect" local_scan interface. Philip has the final say regarding
what will or will not be included in the official exim release, of
course. I want to help make the interface meet my desires, and also
to help implement it (now :-)) as I am able.

Please contribute your $0.02 (in whatever monetary unit you deem most
appropriate).

The Problem :
The current problem with the local_scan API is that it requires exim
to be recompiled whenever the scanner is changed. This is annoying
when the scanner changes often or for experimenting with different
scanners, and runs directly counter to distributions of pre-packaged
software. A better implementation will decouple the scanner from the
rest of exim.

The Goal :
The goal of this discussion is to develop a design for the
implementation of the local_scan function which decouples the scanner
from exim proper.  The goals of this decoupling are to
    1)  increase flexibility wrt choosing a scanner
    2)  allow scanners to be (re)compiled without rebuilding the rest of exim
    3)  allow distributions (eg RedHat or Debian) to provide separate
            packages for exim and each of the available scanners
    4)  allow users of a distribution to use a scanner while still
            using the packaged version of exim

Another goal is for the design (and implementation) to be acceptable
to Philip and included in the standard exim release. This is
important for supporting goals #3 and #4 above.

I think some basic premises must be agreed upon :

o   Incompatible changes will occur from time to time.  Preventing
    these changes from disrupting anything is not realistic.  (eg
    exim4 breaks compatibility with all exim3 config files)

o   When (not if) an incompatibility arises, it is desirable to
    detect, report, and gracefully handle it.

o   Following the XP philosophy, the simplest method that works is
    best.  I would like to keep the implementation as small and as
    simple as possible, while still meeting the goals.

To start with, I'll list some ideas I have which I don't think are
very realistic. I am doing this so that they can be "vetoed" right
from the beginning.

o   Use CORBA or XML-RPC or some such middleware to communicate
    between exim and the scanner.

    Pros:
        o   on-the-wire protocols are standard
        o   implementation libraries are available
        o   allows scanners to be written in any language
        o   eliminates C-level binary compatibility concerns

    Cons:
        o   would require too much complexity in exim itself to
            implement its side of the communication

o   Embed a high-level language interpreter (eg perl, python, or
    java), and let it dynamically load modules and whatnot

    Pros:
        o   eliminates C-level binary compatibility concerns
        o   allows (forces, rather) local_scan functions to be written
            in a language other than C

    Cons:
        o   complexity
        o   increases the size of "exim" since it would contain an
            extra interpreter
        o   increases performance overhead due to startup requirements
            of the extra interpreter
        o   stirs up language wars

o   Same as above, but implement the interpreter ourselves

    Pros:
        o   eliminates C-level binary compatibility concerns
        o   allows (forces, rather) local_scan functions to be written
            in a language other than C
        o   avoids existing language wars

    Cons:
        o   way too much complexity
        o   who has time to create or learn Yet Another high level
            language anyways?
        o   could create a new language war

On a more practical level, I have these ideas :

o   Treat local_scan the same way other (dynamic) libraries, such
    as libldap or libpq, are treated.  Let the system's dynamic linker
    deal with loading the local_scan library at runtime.

    Pros:
        o   eliminates C-level binary compatibility concerns
        o   eliminates the need to write code dealing with dlopen(), etc.

    Cons:
        o   only allows a single liblocal_scan to be installed at any
            time (AFAIK)

    Questions :
        o   How would a system with multiple local_scan libraries
            installed behave?
        o   How would the admin specify which one to use?

The following two ideas are the most practical, I think.

o   Create an interface that leverages existing IPC mechanisms such as
    pipes, UNIX domain sockets (similar to FIFOs, a.k.a. named pipes,
    but bidirectional and accessed through the socket API), or TCP
    sockets to communicate with a scanner.  The scanner would be a
    separate, complete, application.

    Pros:
        o   eliminates C-level binary compatibility concerns
        o   allows local_scan functions to be written in any language
        o   avoids the language wars that embedding an interpreter would create

    Cons:
        o   Requires creating a new protocol.
            (or beating an existing one (maybe LMTP or BSMTP) into a
             shape suitable for this use)
        o   Could be complex.

    Comments:
        o   The complexity of creating and implementing a new protocol
            can be minimized by devising a sufficiently simple
            protocol.
        o   If this mechanism is chosen, then additional discussion on
            the merits of each IPC mechanism and protocol choices will
            need to follow.

    Additional Data:
        One idea I had for this is using a pipe.  exim would open a
        pipe to the specified scanner program.  The message would be
        passed to the scanner on stdin.  The exit code from the
        scanner program would determine what exim should do with it --
        accept, tempreject, permreject.  If the scanner rejects the
        message, its output would be the message to return to the
        other server.  Otherwise its output would specify how the
        message should be modified (namely adding or modifying
        headers).  The format of the header modification text is a
        detail that can be worked out later.
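
To make the pipe idea concrete, here is a rough sketch of what exim's
side could look like.  It is only an illustration, not a proposal for
the final code.  The names run_scanner and scan_verdict are made up,
and the exit-code mapping (0 = accept, 1 = permreject, anything else =
tempreject) is my assumption, since nothing above fixes those values.
The message is staged in a temporary file rather than written down a
second pipe, so exim cannot deadlock against a scanner that writes
output before it has read all of its input.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical verdicts; the post names them but defines no values. */
enum scan_verdict { SCAN_ACCEPT, SCAN_TEMPREJECT, SCAN_PERMREJECT };

/*
 * Run `cmd` through /bin/sh with `msg` on its stdin and capture its
 * stdout (truncated to outsz - 1 bytes) in `out`.
 * ASSUMED exit-code mapping: 0 = accept, 1 = permreject, anything
 * else (including death by signal) = tempreject.
 */
enum scan_verdict run_scanner(const char *cmd, const char *msg, size_t len,
                              char *out, size_t outsz)
{
    FILE *in;
    int outpipe[2], status;
    size_t used = 0;
    ssize_t n;
    pid_t pid;
    char buf[512];

    if (outsz == 0)
        return SCAN_TEMPREJECT;
    out[0] = '\0';

    /* Stage the message in a temp file so the parent never blocks
       writing to a scanner that is itself blocked writing output. */
    if ((in = tmpfile()) == NULL)
        return SCAN_TEMPREJECT;
    if (fwrite(msg, 1, len, in) != len || fflush(in) != 0 ||
        lseek(fileno(in), 0, SEEK_SET) < 0 || pipe(outpipe) != 0) {
        fclose(in);
        return SCAN_TEMPREJECT;
    }

    pid = fork();
    if (pid < 0) {
        close(outpipe[0]);
        close(outpipe[1]);
        fclose(in);
        return SCAN_TEMPREJECT;
    }
    if (pid == 0) {                     /* child: becomes the scanner */
        dup2(fileno(in), STDIN_FILENO);
        dup2(outpipe[1], STDOUT_FILENO);
        close(outpipe[0]);
        close(outpipe[1]);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                     /* exec failed */
    }

    close(outpipe[1]);
    /* Drain all of the scanner's output, keeping only what fits. */
    while ((n = read(outpipe[0], buf, sizeof buf)) != 0) {
        size_t room, take;
        if (n < 0) {
            if (errno == EINTR)
                continue;
            break;
        }
        room = outsz - 1 - used;
        take = (size_t)n < room ? (size_t)n : room;
        memcpy(out + used, buf, take);
        used += take;
    }
    out[used] = '\0';
    close(outpipe[0]);
    fclose(in);

    while (waitpid(pid, &status, 0) < 0)
        if (errno != EINTR)
            return SCAN_TEMPREJECT;
    if (!WIFEXITED(status))
        return SCAN_TEMPREJECT;
    switch (WEXITSTATUS(status)) {
    case 0:  return SCAN_ACCEPT;
    case 1:  return SCAN_PERMREJECT;
    default: return SCAN_TEMPREJECT;
    }
}
```

Treating every unexpected outcome (fork failure, scanner crash,
unknown exit code) as a tempreject is a deliberate choice in this
sketch: a broken scanner then delays mail instead of bouncing or
silently accepting it.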

o   Use libdl (dlopen, dlsym) to load an admin-specified .so.

    Pros:
        o   doesn't require a lot of code
        o   an initial implementation is already available
        o   the scanner API is almost identical to the current one
        o   no new protocols need to be devised

    Cons:
        o   C is (very apparently) not well suited for dynamic programs
        o   the libdl API doesn't provide any type checking the way
            the C compiler does (or the way python does for "dynamic"
            modules)
        o   This makes it easy for an admin to shoot a 3-sided hole in
            exim.  If a bad .so is specified (accidentally or
            maliciously), exim _could_ have a hard time handling it
            gracefully.  It will more than likely crash if the ABI
            checking doesn't catch the mismatch.
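
For comparison, a minimal sketch of the libdl approach with an
explicit version check might look like the following.  This is not the
existing initial implementation mentioned above.  The exported symbol
name local_scan_abi_version, the LOCAL_SCAN_ABI_VERSION value, and the
load_scanner helper are all hypothetical, and the entry-point type is
only modeled on the current local_scan() (with char standing in for
exim's uschar).  On older glibc this needs to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical ABI version that exim would be compiled against. */
#define LOCAL_SCAN_ABI_VERSION 1

/* Entry point type, modeled on local_scan(); an assumption here. */
typedef int (*scan_fn)(int fd, char **return_text);

/*
 * Load the scanner at `path`.  On success, return the dlopen handle
 * and store the entry point in *fn.  On any failure, return NULL,
 * leave *fn untouched, and describe the problem in `err`.
 */
void *load_scanner(const char *path, scan_fn *fn, char *err, size_t errsz)
{
    void *handle, *sym;
    const char *msg;
    int *version;

    handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL) {
        msg = dlerror();
        snprintf(err, errsz, "cannot load %s: %s", path,
                 msg ? msg : "unknown error");
        return NULL;
    }

    /* The scanner is expected to export an int holding the ABI
       version it was built for (hypothetical symbol name). */
    version = (int *)dlsym(handle, "local_scan_abi_version");
    if (version == NULL || *version != LOCAL_SCAN_ABI_VERSION) {
        snprintf(err, errsz, "%s: missing or mismatched ABI version", path);
        dlclose(handle);
        return NULL;
    }

    sym = dlsym(handle, "local_scan");
    if (sym == NULL) {
        snprintf(err, errsz, "%s: no local_scan symbol", path);
        dlclose(handle);
        return NULL;
    }

    /* POSIX-style conversion of a dlsym() result to a function pointer. */
    *(void **)fn = sym;
    return handle;
}
```

A check like this catches a stale or wrong .so before any of its code
is called through the function pointer, but it cannot verify that the
function really has the expected signature, which is exactly the
type-checking weakness listed in the Cons above.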

-D

--
If we claim we have not sinned, we make Him out to be a liar and His
Word has no place in our lives.
        I John 1:10

http://dman.ddts.net/~dman/