Re: [Exim] Re: closed connection in response to STARTTLS.

Author: Philip Hazel
Date:  
To: David Woodhouse
CC: exim-users
Subject: Re: [Exim] Re: closed connection in response to STARTTLS.
On Wed, 24 Apr 2002, David Woodhouse wrote:

> No. I only know that the child Exim tried STARTTLS on the socket it
> inherited from the parent, which means _someone_ did something wrong.


OK, I finally see what you are complaining about. Sorry for being so
slow. However, I do not believe that the RFC covers this case, which
looks like this:

    connection established
    C: EHLO
    S: acks, and advertises support for STARTTLS
    C: STARTTLS
    S: OK
    TLS negotiation succeeds, session is now encrypted
    C: EHLO
    S: acks, does NOT advertise STARTTLS
    .... session proceeds ....
    Client calls OpenSSL function to shut down encryption
    Client and server negotiate termination of encryption
    Session is now running in clear
    C: STARTTLS


Is that STARTTLS command legal or not? All I can find in the RFC is this:

Both the client and the server MUST know if there is a TLS session
active. A client MUST NOT attempt to start a TLS session if a TLS
session is already active. A server MUST NOT return the TLS extension
in response to an EHLO command received after a TLS handshake has
completed.

Well, both ends do know if there is a TLS session active (there is not,
because it's been closed down). The client is not sending STARTTLS
within a TLS session. The server isn't advertising STARTTLS when it
shouldn't. Note that Exim does not send another EHLO before the second
STARTTLS.

As far as I can see, the RFC does not cover the case of a server and
client deciding to terminate a TLS session. (I agree, of course, that
this is a grey area. :-)
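
To make the two readings concrete, here is a toy model of the sequence above. It is only a sketch: it is not Exim code, every name in it (Session, server_ehlo, replay, and so on) is made up for illustration, and the "strict" check encodes the assumed general rule that a client must not use an option the most recent EHLO response did not advertise, which is not in the quoted RFC text.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical shared view of one SMTP connection (illustration only)."""
    tls_active: bool = False           # is a TLS session running right now?
    handshake_completed: bool = False  # has a TLS handshake ever completed?
    starttls_advertised: bool = False  # result of the most recent EHLO


def server_ehlo(s: Session) -> bool:
    # Quoted rule: the server MUST NOT return the TLS extension in response
    # to an EHLO received after a TLS handshake has completed.
    s.starttls_advertised = not s.handshake_completed
    return s.starttls_advertised


def tls_handshake(s: Session) -> None:
    s.tls_active = True
    s.handshake_completed = True


def tls_shutdown(s: Session) -> None:
    # Client and server negotiate termination; session is now in clear.
    s.tls_active = False


def quoted_rules_allow_starttls(s: Session) -> bool:
    # Quoted rule: a client MUST NOT attempt to start a TLS session if a
    # TLS session is already active.
    return not s.tls_active


def strict_reading_allows_starttls(s: Session) -> bool:
    # Assumed extra rule (not in the quoted text): STARTTLS must also have
    # been advertised in the most recent EHLO response.
    return (not s.tls_active) and s.starttls_advertised


def replay() -> dict:
    """Replay the sequence from the message and record each decision."""
    s = Session()
    result = {}
    result["first_ehlo_advertises"] = server_ehlo(s)
    tls_handshake(s)
    result["second_ehlo_advertises"] = server_ehlo(s)
    tls_shutdown(s)
    # No further EHLO is sent before the second STARTTLS.
    result["quoted_rules_allow_second_starttls"] = quoted_rules_allow_starttls(s)
    result["strict_reading_allows_second_starttls"] = strict_reading_allows_starttls(s)
    return result


if __name__ == "__main__":
    for name, value in replay().items():
        print(f"{name}: {value}")
```

In this model the second STARTTLS passes the check built from the quoted rules alone and fails the strict one, which is the disagreement in a nutshell.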

> If, as is usually reasonable, we assume that Exim is behaving correctly,
> then we must conclude that the Postfix server to which my box was talking
> must have advertised STARTTLS in the EHLO response immediately following the
> TLS setup -- which is explicitly forbidden by RFC 3207.


Hmm. I guess that reading is based on the general rule that the client
must not try any option that isn't advertised? Certainly RFC 2487
doesn't seem to say "thou shalt not send STARTTLS unless the response to
the most recent EHLO advertised it". Oh, I see that 3207 obsoletes
2487. I hadn't noticed that .... goes away to read .... no, it doesn't
seem to add anything in this regard.

> > No, Exim doesn't work like that. At least, not unless you've
> > configured it "unusually", that is, with queue_smtp_domains or using
> > -odqs or -qq.
>
> Or queue_only_load.


No. NOT queue_only_load (or plain queue_only). Those options just put
the message on the queue. They do NOT do the routing to determine which
hosts the message is destined for. Only queue_smtp_domains or -odqs or
using a -qq queue run do that.

> I tried that; I didn't like it. Now I set queue_only_load :)


I repeat: that will NOT record which hosts messages are destined for.
Each message will be delivered independently, but from queue runs
instead of in immediate delivery processes. (Unless you use -qq to start
a queue run, as previously stated.)

> This I understand. Having the queue runner process inherit sockets from the
> server is an optimisation for precisely that 5% about which we don't care,
> right? The percentage of cases where the socket can be reused and the
> parent Exim process had actually been using TLS on said socket is even lower.


Presumably.

But for the record:

We are using different terminology here. It is not a queue runner
process that inherits the socket. It is a _delivery_ process, for one
message. (That process may, in turn, pass the socket on to another
delivery process.) Also, the inheritance is from another delivery
process, not from "the server". Exim has no central "server".

> OTOH, this optimisation has been observed to cause mail to fail to get
> delivered, because it causes Exim to behave in a way which although not
> explicitly forbidden is at best dubious - from my skim-reading of the RFC I
> would not infer that taking down TLS and continuing the session is
> permitted.


It may cause mail to be delayed, but I don't believe it causes mail to
fail to be delivered.

> So forget taking down TLS, sending RSET and hoping the other end either
> copes with that or kicks us off immediately for it without causing our next
> delivery attempt to fail. Let's just (QUIT and) close the socket and forget
> about it if we've been doing TLS over it, rather than passing it to the
> child.


That you can already achieve by setting hosts_nopass_tls = *. The
problem with making this the default is that it can be pessimal in cases
where passing the socket works. These are cases such as dial-in hosts
that are sending a whole pile of messages to a smarthost, and need to
use TLS in order to do AUTH in a secure manner. People managing these
kinds of host are often, how shall I put it?, "less experienced".

I'm going to add the RSET test. If the server drops the connection, Exim
will behave as if it had done QUIT in the first place. Of course, we may
find that Postfix accepts RSET, and still objects to the STARTTLS. I
don't actually know of a Postfix server that advertises STARTTLS to test
this with.
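
As a rough sketch of the decision the RSET test makes (this is not the Exim implementation; the function name, the send_command callback, and the return values are all hypothetical, and treating a non-2xx reply to RSET the same as a dropped connection is my assumption here):

```python
from typing import Callable, Optional

# send_command is a hypothetical callback: it sends one SMTP command line
# and returns the server's reply line, or None if the connection was dropped.
SendCommand = Callable[[str], Optional[str]]


def decide_after_tls_shutdown(send_command: SendCommand) -> str:
    """Probe the server with RSET once TLS has been shut down.

    Returns "closed" if the connection should be treated as finished
    (as if QUIT had been done in the first place), or "pass_socket" if
    the connection looks usable by the next delivery process.
    """
    reply = send_command("RSET")
    if reply is None:
        # Server dropped the connection: behave as if we had sent QUIT.
        return "closed"
    if not reply.startswith("2"):
        # Assumption: an error reply is handled like a drop, after a
        # polite QUIT.
        send_command("QUIT")
        return "closed"
    return "pass_socket"


if __name__ == "__main__":
    print(decide_after_tls_shutdown(lambda cmd: None))
    print(decide_after_tls_shutdown(lambda cmd: "250 OK"))
```

Note that a "pass_socket" result says nothing about whether the server will then accept STARTTLS; a server that accepts RSET and still objects to the second STARTTLS is not detected by this check.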

> IOW, I think the semantics of the hosts_nopass_tls option need reversing -
> we should explicitly list the hosts which _will_ accept this behaviour,
> rather than the hosts which won't. Would you accept that?


Choosing defaults in this kind of case is always difficult. I note that
there hasn't been a huge swell of contributions to this thread. Does
that mean that the rest of the Exim list doesn't know or care about this
issue?

While I am sympathetic to that point of view, I prefer to avoid making
incompatible changes (of any kind) if at all possible. They always bite
somebody.

So:

Is there anybody else on this list who has views on this issue? The
options are:

1. Do nothing (but I don't think that's actually realistic).
2. Add the RSET test, but do nothing else.
3. Change the default for hosts_nopass_tls to be *.
4. Replace hosts_nopass_tls with hosts_pass_tls, defaulting to unset.
5. Abolish hosts_nopass_tls and never pass on the socket.

For myself, I don't like either 1 or 5 (I tend to avoid extremes).

--
Philip Hazel            University of Cambridge Computing Service,
ph10@???      Cambridge, England. Phone: +44 1223 334714.