Re: [exim-dev] Questions about integration of exim with rspa…

Top Page
Delete this message
Reply to this message
Author: Phil Pennock
Date:  
To: Vsevolod Stakhov
CC: exim-dev
Subject: Re: [exim-dev] Questions about integration of exim with rspamd spam filter
On 2012-09-12 at 20:28 +0400, Vsevolod Stakhov wrote:
> I've tested this patch on both SA and rspamd and it seems to be
> working fine. But it can be surely improved by extending rspamd reports.


> +++ b/src/src/spam.c
> -           "REPORT SPAMC/1.2\r\nUser: %s\r\nContent-length: %ld\r\n\r\n",
> -           user_name,
> -           mbox_size);
> -
> -  /* send our request */
> -  if (send(spamd_sock, spamd_buffer, Ustrlen(spamd_buffer), 0) < 0) {


Sends a very bounded amount of data; user_name is under administrator
control and mbox_size scales log10(n), so the total size will be fine
even with petabyte-sized emails.

> +    r = 0;
> +    r += spam_push_line(request_v, request_p++, "CHECK RSPAMC/1.3\r\n");
> +    r += spam_push_line(request_v, request_p++, "Content-length: %lu\r\n", mbox_size);
> +    r += spam_push_line(request_v, request_p++, "Queue-Id: %s\r\n", message_id);
> +    r += spam_push_line(request_v, request_p++, "From: <%s>\r\n", sender_address);
> +    r += spam_push_line(request_v, request_p++, "Recipient-Number: %d\r\n", recipients_count);
> +    /* copy all recipients as well */
> +    for (i = 0; i < recipients_count && r < 55; i ++)
> +        r += spam_push_line(request_v, request_p++, "Rcpt: <%s>\r\n", recipients_list[i].address);


Sends an amount of data which can grow large, the amount under an
attacker's control. Each email address can be 320 octets; sender and 55
(which is a magic number derived manually from another magic number used
for sizing the array, instead of a compile-time calculation) recipients,
so we can easily be over 17kB just for this request.

> +    if (writev(spamd_sock, request_v, request_p) < 0) {
> +        (void)close(spamd_sock);
> +        log_write(0, LOG_MAIN|LOG_PANIC,
> +            "spam acl condition: spamd (rspamd) send failed: %s", strerror(errno));
> +        (void)fclose(mbox_file);
> +        (void)close(spamd_sock);
> +        return DEFER;
> +    }


This error-handling approach, which we could (just) get away with for
send() with a smaller amount of data, is no longer safe. Simply because
we can be waiting on a round-trip, EINTR is more common. A short write
becomes more likely.

You need to capture the amount of data written by writev(), handle short
writes and repair this, _because_ you're sending an arbitrarily large
amount of data. Also, the "send" in the error message is now
misleading.

You can change the spamassassin variant to use writev() too, so that
things are more consistently in the style you prefer -- that's better
than duplicated logic.

> +static int
> +spam_push_line(struct iovec *iov, const int i, const char *fmt, ...)
> +{
> +  va_list ap;
> +  size_t len;
> +  char buf[512];
> +
> +  if (i >= 64) {
> +    log_write(0, LOG_MAIN, "rspam: %s: index out of bounds", __FUNCTION__);


Nope, sorry, __FUNCTION__ is Gcc-specific and Exim isn't tied to Gcc.
Folks still build on older compilers for older platforms. We mandate
C89 or greater, and have some portability hacks as needed. It's okay to
have macros which have _more_ functionality on certain compilers, as
long as they degrade gracefully to still work elsewhere and this is
handled when the macro is defined.

C99 gives us __func__, we could definitely do something like pull in the
example from http://gcc.gnu.org/onlinedocs/gcc/Function-Names.html into
a header somewhere, so you could use __func__. If you do want to have
__func__, then please add this to exim.h as part of your patch:

     #if __STDC_VERSION__ < 199901L
     # if __GNUC__ >= 2
     #  define __func__ __FUNCTION__
     # else
     #  define __func__ "<unknown>"
     # endif
     #endif


That works for me: lets us write with the standard name, explicitly
checks with a supposedly portable guard, we can handle undefined
__STDC_VERSION__ if needed, and is portable to older gcc without tying
the code to gcc.

- -Phil