[exim] Fixing LDAP timeout handling

Top Page
Delete this message
Reply to this message
Author: Michael Haardt
Date:  
To: exim-users
Subject: [exim] Fixing LDAP timeout handling
Hello,

LDAP queries offer two parameters: CONNECT and TIMELIMIT. CONNECT
is supposed to apply to connect() calls for LDAP connections, whereas
TIMELIMIT should set to server-side search limits. With OpenLDAP 2,
CONNECT does not work and TIMELIMIT may cause unexpected behaviour.

CONNECT is only supposed to work for Netscape SDK, specifically
for systems where LDAP_X_OPT_CONNECT_TIMEOUT is defined. OpenLDAP uses
a different option called LDAP_NETWORK_TIMEOUT. Adding it, things
work like this:

ldap_initialise()
ldap_set_option()
ldap_bind_s() (using CONNECT timeout now to establish the connection,
               but no further timeouts for hanging connections)
ldap_search()
ldap_result() (using TIMELIMIT timeout for network communication?)


As a result, there are two places where hung connections might block
Exim processes. If they run as part of a queue runner, the queue runner
is blocked.

Unfortunately, ldap_bind_s() does not use the network timeout option,
because *_s operations don't do that. There is no *_st version either,
so I had to code my own version to use ldap_bind() and ldap_result().

Further TIMELIMIT does not just set the server-side limit option, but is
a hard limit on the query time including network latencies. Not good,
plus ldap_result() will also block on a hung connection if no server-side
query timeout is used.

I hacked the code to use ldap_bind() and ldap_result() and to use the
CONNECT timeout for connect() and ldap_result(). CONNECT is not the right
name anymore, perhaps it should be renamed as TIMEOUT or NETWORKTIMEOUT,
to distinguish it from TIMELIMIT.

I append my patch, but please check if the new bind code still works
correctly. I would like as much people as possible to test this before
applying it to the official sources. Any bugs inside the bind code
may cause things to break badly or even open security holes. The current
code allows DoS attacks with long-lasting results, like blocking all
queue runners, though. Just to motivate potential testers. ;-)

Michael
----------------------------------------------------------------------
--- src/lookups/ldap.c.orig    2004-10-28 16:22:08.000000000 +0200
+++ src/lookups/ldap.c    2004-11-02 12:34:59.000000000 +0100
@@ -70,13 +70,6 @@
 #endif



-/* For libraries without TCP connect timeouts */
-
-#ifndef LDAP_X_IO_TIMEOUT_NO_TIMEOUT
-#define LDAP_X_IO_TIMEOUT_NO_TIMEOUT (-1)
-#endif
-
-
/* Four types of LDAP search are implemented */

 #define SEARCH_LDAP_MULTIPLE 0       /* Get attributes from multiple entries */
@@ -134,7 +127,7 @@
   password      password for authentication, or NULL
   sizelimit     max number of entries returned, or 0 for no limit
   timelimit     max time to wait, or 0 for no limit
-  tcplimit      max time to connect, or NULL for OS default
+  tcplimit      max time to connect, or 0 for OS default
   deference     the dereference option, which is one of
                   LDAP_DEREF_{NEVER,SEARCHING,FINDING,ALWAYS}


@@ -374,9 +367,20 @@
in Netscape SDK v4.1; I don't know about other libraries. */

   #ifdef LDAP_X_OPT_CONNECT_TIMEOUT
-  ldap_set_option(ld, LDAP_X_OPT_CONNECT_TIMEOUT, (void *)&tcplimit);
+  if (tcplimit > 0)
+    {
+    unsigned int timeout=tcplimit*1000;
+    ldap_set_option(ld, LDAP_X_OPT_CONNECT_TIMEOUT, (void *)&timeout);
+    }
   #endif


+  /* Set the TCP connect timeout.  This works with OpenLDAP 2.2.14. */
+
+  #ifdef LDAP_OPT_NETWORK_TIMEOUT
+  if (tcplimit > 0)
+    ldap_set_option(ld, LDAP_OPT_NETWORK_TIMEOUT, (void *)timeoutptr);
+  #endif 
+
   /* I could not get TLS to work until I set the version to 3. That version
   seems to be the default nowadays. The RFC is dated 1997, so I would hope
   that all the LDAP libraries support it. Therefore, if eldap_version hasn't
@@ -441,6 +445,15 @@
       host, porttext);
   }


+/* Whatever follows, obey this timeout in any requests. */
+
+if (tcplimit > 0)
+ {
+ timeout.tv_sec = tcplimit;
+ timeout.tv_usec = 0;
+ timeoutptr = &timeout;
+ }
+
/* Bind with the user/password supplied, or an anonymous bind if these values
are NULL, unless a cached connection is already bound with the same values. */

@@ -455,23 +468,40 @@
   {
   DEBUG(D_lookup) debug_printf("%sbinding with user=%s password=%s\n",
     (lcp->bound)? "re-" : "", user, password);
-  if ((rc = ldap_bind_s(lcp->ld, CS user, CS password, LDAP_AUTH_SIMPLE))
-       != LDAP_SUCCESS)
+  if ((msgid = ldap_bind(lcp->ld, CS user, CS password, LDAP_AUTH_SIMPLE))
+       == -1)
     {
-    /* Invalid credentials when just checking credentials returns FAIL. This
-    stops any further servers being tried. */
+    *errmsg = string_sprintf("failed to bind the LDAP connection to server "
+      "%s%s - LDAP error", host, porttext);
+    goto RETURN_ERROR;
+    }


-    if (search_type == SEARCH_LDAP_AUTH && rc == LDAP_INVALID_CREDENTIALS)
-      {
-      DEBUG(D_lookup)
-        debug_printf("Invalid credentials: ldapauth returns FAIL\n");
-      error_yield = FAIL;
-      goto RETURN_ERROR_NOMSG;
-      }
+  if ((rc = ldap_result( lcp->ld, msgid, 1, timeoutptr, &result )) <= 0)
+    {
+    *errmsg = string_sprintf("failed to bind the LDAP connection to server "
+      "%s%s - LDAP error: %s", host, porttext, rc == -1 ? "result retrieval failed" : "timeout" );
+    result = NULL;
+    goto RETURN_ERROR;
+    }
+
+  rc = ldap_result2error( lcp->ld, result, 0 );
+
+  /* Invalid credentials when just checking credentials returns FAIL. This
+  stops any further servers being tried. */


-    /* Otherwise we have a problem that doesn't stop further servers from being
-    tried. */
+  if (search_type == SEARCH_LDAP_AUTH && rc == LDAP_INVALID_CREDENTIALS)
+    {
+    DEBUG(D_lookup)
+      debug_printf("Invalid credentials: ldapauth returns FAIL\n");
+    error_yield = FAIL;
+    goto RETURN_ERROR_NOMSG;
+    }


+  /* Otherwise we have a problem that doesn't stop further servers from being
+  tried. */
+
+  if (rc != LDAP_SUCCESS)
+    {
     *errmsg = string_sprintf("failed to bind the LDAP connection to server "
       "%s%s - LDAP error %d: %s", host, porttext, rc, ldap_err2string(rc));
     goto RETURN_ERROR;
@@ -482,6 +512,9 @@
   lcp->bound = TRUE;
   lcp->user = (user == NULL)? NULL : string_copy(user);
   lcp->password = (password == NULL)? NULL : string_copy(password);
+
+  ldap_msgfree(result);
+  result = NULL;
   }


/* If we are just checking credentials, return OK. */
@@ -526,13 +559,6 @@
/* Loop to pick up results as they come in, setting a timeout if one was
given. */

-if (timelimit > 0)
-  {
-  timeout.tv_sec = timelimit;
-  timeout.tv_usec = 0;
-  timeoutptr = &timeout;
-  }
-
 while ((rc = ldap_result(lcp->ld, msgid, 0, timeoutptr, &result)) ==
         LDAP_RES_SEARCH_ENTRY)
   {
@@ -914,7 +940,7 @@
 BOOL defer_break = FALSE;
 int timelimit = LDAP_NO_LIMIT;
 int sizelimit = LDAP_NO_LIMIT;
-int tcplimit = LDAP_X_IO_TIMEOUT_NO_TIMEOUT;
+int tcplimit = 0;
 int dereference = LDAP_DEREF_NEVER;
 int sep = 0;
 uschar *url = ldap_url;
@@ -947,7 +973,7 @@
       else if (strncmpic(name, US"PASS=", namelen) == 0) password = value;
       else if (strncmpic(name, US"SIZE=", namelen) == 0) sizelimit = Uatoi(value);
       else if (strncmpic(name, US"TIME=", namelen) == 0) timelimit = Uatoi(value);
-      else if (strncmpic(name, US"CONNECT=", namelen) == 0) tcplimit = Uatoi(value) * 1000;
+      else if (strncmpic(name, US"CONNECT=", namelen) == 0) tcplimit = Uatoi(value);


       /* Don't know if all LDAP libraries have LDAP_OPT_DEREF */