Re: [exim-dev] Questions about integration of exim with rspamd spam filter

Author: Phil Pennock
Date:
To: Todd Lyons
CC: exim-dev, Vsevolod Stakhov
Subject: Re: [exim-dev] Questions about integration of exim with rspamd spam filter

On 2012-09-11 at 10:40 -0700, Todd Lyons wrote:
> On Mon, Sep 10, 2012 at 11:13 AM, Vsevolod Stakhov
> <vsevolod@???> wrote:
> > I've seen the topic in this mailing list related to integration of
> > rspamd and exim. I'm the author of rspamd and I'd really very
> > appreciate if exim will be able to interact with rspamd natively.

I think rspamd support is a good idea, as long as the code fits. I'd be
very happy to see this in the 4.81 release (whenever it happens).

> > 2) Why do you think that push_line function in the current patch is
> > not good for exim (it uses writev for effective data transmission)?
>
> It's not that it's "not good for exim", rather it's just different
> than the way it's normally done in exim. You chose to use a r +=
> push_line() way of adding to a string (adding to a va_item array),
> whereas in exim everything else tends to either create it all in one
> command (i.e the string_format() in the current spam() function) or
> using string_snprintf().
>
> Personally I am not opposed to your method, it's just different and
> "new", and so requires some thinking to make sure it doesn't introduce
> any security holes. My c-fu is not strong enough to know for sure if
> this is the most correct way, I have to defer that to others.

writev() was standardised in 2001, Exim predates that, so doesn't tend
to include writev() in places where it might make sense. We do use
writev() in src/auths/pwcheck.c and src/malware.c for mksd support.
That support is not feature-conditional upon mksd, so as long as
WITH_CONTENT_SCAN is defined, it will work.

Frankly, I think that as long as return values are checked, writev() is
fine. If it causes compilation issues for an actual platform where Exim
is used, we'll tackle that later. In fact, I can ask about that on
exim-users. I say, stick to writev().

Having a consistent code style for a project is nice, but that's
_because_ it reduces the support burden for the codebase as a whole.
Where particular pieces of code are maintained by folks from another
project, then as long as the code is clean and understandable, fitting
into the general model of Exim's runtime (ie, not multi-threaded, etc)
then I think that we should be fairly lenient about Exim's internal API
usage.

In fact, Exim is already using different styles for the integration code
for other pieces of software. That might not be a _good_ thing, but
it's the current practice and we should make more of an effort to clean
that up before we impose new rules for contributors, as otherwise we
only drive away most help.

My opinion is that if the maintainer of the external software is helping
to maintain the interface code in Exim, then that counts for a _lot_.

> > 3) Can I introduce some new global variables like <spamd_action> -
> > this can be useful as rspamd has 'actions': policies for recommended
> > action with this email (like 'accept', 'reject', 'add header' etc)?
>
> One of the suggestions by Phil was to add a new global variable or
> /modifier (to the spam call) or some other way to indicate the
> different calling requirements. A similar method would be required to
> indicate the different return requirements. I'm open for conversation
> on this one. We need to establish exactly what settings and
> conditions to add, and what other configuration adjustments are
> needed.

We should not require the administrator to set two knobs to be able to
correctly call an external API, where only setting one knob leaves
things inconsistent. The use of the knob changes the API variant in
use, which affects both how parameters are passed and how return values
are interpreted.

As I suggested before, my opinions are not hard rules, merely my
opinions, so the actual decisions should be left to the person doing the
actual coding work, and the person working with them to integrate into
Exim. With that in mind, here are my opinions.

In this case, I think that the ACL modifier suggestion of mine was
_wrong_. As was spamd_options. Sorry.

The API language used is a property of the remote system being talked
to. It's not a parameter of the call, but something inherently tied to
the connection and the expected remote server. It is a mistake to be
sometimes using one language and sometimes another when talking to the
same remote server.

I think that the best approach is to modify the "spamd_address" option
parsing, to take a "variant=<spamassassin|rspam>" modifier on an
end-point, with spamassassin being the default (for backwards
compatibility). So the admin should be able to say:

  spamd_address = /var/run/rspam_socket variant=rspam
or:
  spamd_address = 192.168.2.10 783 variant=rspam : \
                  192.168.2.11 783 variant=rspam : \
                  192.168.2.12 783 variant=rspam

Note that spamd_address will be expanded _IF_ it starts with a
dollar-sign, so it's possible for someone to construct a complicated
rule which will sometimes result in calling SpamAssassin and sometimes
result in calling rspam. This is why "spamd_options" would be bad:
the administrator keeping the two Exim-level options in sync across
expansions would be very error-prone.

The expansion is already done. If the expansion differs, we'll not use
the cached result, so "how does someone call both SA and rspamd?" is
solved.

There will be a little bit of mangling needed to change how the option
details are extracted from the string, for the two cases. All easy,
albeit not as abstracted away for simplicity as it could be. :/

-Phil

This message is part of the following thread:
	the complete thread tree sorted by date
	Todd Lyons at
	Vsevolod Stakhov at

Re: [exim-dev] Questions about integration of exim with rspa…