Re: which is more secure?

Author: T. William Wells
To: exim-users
Subject: Re: which is more secure?
In any mailer, the general flow is:

    1) accept the mail from the outside
    2) optionally, place the mail in a queue
    3) decide how to deliver the mail
    4) deliver the mail


1) can be done under any uid (after the initial socket bind)
2) must be done under the queue's owner or group
3) must have privileges needed to examine routing files
4) must have privileges to write to necessary files or exec necessary programs
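The transition into stage 2 might be sketched like this (a simplified illustration; the numeric ids are hypothetical, not exim's real configuration):

```c
#include <unistd.h>
#include <sys/types.h>

/* Hypothetical numeric ids for illustration -- not exim's real values. */
#define QUEUE_UID ((uid_t)107)
#define QUEUE_GID ((gid_t)107)

/* Drop from root to the queue owner's uid/gid for stage 2.
 * Order matters: setgid() must come first, while we still have
 * the privilege to change groups; after setuid() it is too late. */
static int drop_to_queue_owner(void)
{
    if (setgid(QUEUE_GID) != 0)
        return -1;
    if (setuid(QUEUE_UID) != 0)
        return -1;
    /* Verify the transition actually took; a drop that silently
     * fails is exactly the kind of hole discussed below. */
    if (getuid() != QUEUE_UID || getgid() != QUEUE_GID)
        return -1;
    return 0;
}
```

Note that the check after the calls is not paranoia: on some systems setuid() can fail for a process that appears privileged, and proceeding anyway means delivering mail with the wrong identity.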

Now, here is the thing: since the mail goes through all of these
steps, somewhere along the line the privileges must be altered
between those states. And, except in the cases where the
privileges are a fixed function (e.g., root or the exim user id),
they must be computed.

*In any mailer* this is a potential security hole.

The only effective remedy for this is careful coding. It doesn't
really matter whether the code is sendmail's can of worms or
exim's, nor what the precise security model is; if the mailer
has to make a decision as to which privileges to grant, it
might be tricked into making the wrong decision by some
coding error.

Here's a simple example: suppose that stage 2 can be tricked into
copying an arbitrary file into the spool area. That file might
look like a real mail message from a trusted source, one which
then directs stages 3 or 4 to violate security.

You could run each and every part with the absolute minimum
privileges needed, ensure that all transitions are carefully
controlled -- and still be shit out of luck.

The point of this? To make a secure mailer, you need two things:

    1) A security model which can be verified to be correct and
       which the code can be validated against.

    2) Code that is, *at all stages*, validated for code
       correctness and conformity to the security model.


Presumably, exim's security model is adequate to the task. Thus
the only issue is whether the code implements that model. And the
only way to be sure is to validate each line of code, then each
algorithm and then the entire program flow. This task can be made
easier and more reliable by *coding standards* -- that is, a
deliberate restriction of use of features of the language. In the
ideal case, each and every statement should be on-its-face valid,
meaning that it implements, *always*, the function that is implied
by the statement. In the real world, each statement should either
function as the ideal case or fail in such a way as to forbid
propagation of the error.

Consider this code:

    strcat(a, b);


This statement can fail silently. Without an examination of
potentially the entire program, there is no way to validate that
it works correctly.

Here's another:

void
my_strcat(char *dst, size_t len, const char *src)
{
    size_t  len2;

    if (len <= sizeof(char *)) {
        longjmp(ohshit, 1);
    }
    len2 = strlen(dst);
    if (len2 + strlen(src) + 1 > len) {
        longjmp(ohshit, 1);
    }
    strcpy(dst + len2, src);
}


#define STRCAT(dst,src) my_strcat(dst, sizeof(dst), src)

    STRCAT(a, b);


In this case, we're creating a "strcat" that is to protect against
overwrite errors. Note that there are actually two protections
here. Since C doesn't really distinguish between pointers and
arrays, there's a run-time check to distinguish between the two
cases (with the implicit assumption that all buffers are longer
than a character pointer) as well as a check to ensure that you
don't overflow the buffer. Note also that an overflow will result
in the program refusing to process the message further, thus
eliminating data-truncation holes.

There are the obvious costs to this approach. But the benefit is
very important: code that, on its face, doesn't violate the
language model. Another goal to be obtained by this type of coding
is ensuring that no undefined behavior occurs. An extreme version
of this would forbid the use of bare signed variables -- as their
overflow behavior isn't fixed by the language.

Sure, the devil's advocate may answer, but I've only really moved
the failure modes to where they're less visible. Well, yes and
no. If there's a flaw in this particular coding idea, it'll be
really tricky to spot. But it will be *in one place* -- the design
of this particular piece of code. Not spread all throughout the
program.

In some sense, C is the wrong language for writing secure
programs. It's a bare-metal language, allowing you to do damned
near anything. I happen to like that freedom, but the price is
that incorrectly coded programs can behave in entirely
unpredictable ways; and because C really doesn't try to protect
you here, it's difficult to create programs that are knowably
correctly coded.