I've spent quite a bit of time tracking this down -- probably
more than needed since I am not a C programmer. No one else
seems to have run into it.
Nevertheless, here is the situation.
Debian Linux Kernel 2.4.14
exim 3.34
OpenLDAP 2.0.21
I use LDAP in probably too many places. In this particular scenario
I expand aliases using LDAP, and I also use LDAP to get uid/gid
information for delivery. With the current src/lookups/ldap.c,
this produces the following error whenever an alias expands to
more than one local address:
exim: openldap2-2.0.21/libraries/libldap/error.c:221: ldap_parse_result:
Assertion `r != ((void *)0)' failed.
2002-04-04 11:11:52 queue run: process 8863 crashed with signal 6 while
delivering 16tAbg-0002C7-00
In cases like this, exactly one local delivery succeeds per
attempt. Exim keeps retrying, and each attempt produces one more
successful local delivery, until all deliveries have succeeded.
In tracking this down, I saw that exim forks a process for the
delivery. The child calls "search_tidyup", which closes the LDAP
connection. Control then returns to the parent process, which
attempts to reuse the closed LDAP connection for the next
delivery. At least this is what I think I saw. I wasn't able to
pinpoint the fork, but it should be in transports.c. Here is the
relevant output of "-d9 -N", with some slight instrumentation
that I added to the code:
*** delivery by filtered_delivery_maildir transport bypassed by -N option
search_tidyup called
unbind LDAP connection to LDAP_SERVER:389
(1) (10089) tidy host: LDAP_SERVER
journalled user1@???
filtered_delivery_maildir transport returned OK for user1@???
post-process user1@??? (0)
user1@??? succeeded: adding to nonrecipients list
LOG: 0 MAIN
*> user1 <alias@???> D=filtered_addr_to_local T=filtered_delivery_maildir
locked /var/spool/exim/db/retry.lockfile
opened DB file /var/spool/exim/db/retry: flags=0
dbfn_read: key=T:user2@???
search_open: ldap "NULL"
cached open
search_find: file="NULL"
key="LDAP_LOOKUP_URL" partial=-1
LRU list:
internal_search_find: file="NULL"
type=ldap key="LDAP_LOOKUP_URL"
database lookup required for LDAP_LOOKUP_URL
LDAP parameters: user=NULL pass=NULL size=0 time=0
perform_ldap_search: ldap URL ="LDAP_LOOKUP_URL" server=LDAP_SERVER port=389 sizelimit=0 timelimit=0
(2) (10087) check host: batman.everybody.org
(10087) use host: batman.everybody.org
Re-using cached connection to LDAP server LDAP_SERVER:389
search ended by ldap_result yielding -1
(3) exim: openldap2-2.0.21/libraries/libldap/error.c:221: ldap_parse_result: Assertion `r != ((void *)0)' failed.
Notes:
(1) PID 10089 releases the connection on the global
ldap_connections list. I'm not a C programmer, but I thought
that after the fork each process was supposed to have its own
separate copy of memory, so releasing the connection here
wouldn't release it in the parent process. However,
(2) the parent process (PID 10087) checks for open connections,
finds one, and attempts to reuse the same connection we just
closed in the child. This connection
(3) is closed, so the attempt to use it fails.
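
One possible reading of notes (1) to (3), offered as a guess rather
than something confirmed against the exim source: the fork does give
the child its own copy of the ldap_connections list, but the socket
behind the connection is shared by both processes. If the child says
goodbye to the server over that socket, the parent's copy of the list
still claims the connection is open. The sketch below imitates this
with a socketpair instead of a real LDAP server. The struct, the
function name and the "UNBIND" string are all made up for
illustration.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for one cached connection; not exim's struct. */
struct cached_conn {
    int fd;       /* socket to the "server" */
    int is_open;  /* what this process believes about the connection */
};

/*
 * Returns 1 if the "server" received the child's goodbye while the
 * parent's cache entry still says the connection is open, 0 if not,
 * and -1 on a system error.
 */
int parent_cache_goes_stale(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* sv[0] is the client side; sv[1] plays the server. */
    struct cached_conn cache = { sv[0], 1 };

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: tidy up, as search_tidyup appears to do in the log. */
        close(sv[1]);
        ssize_t w = write(cache.fd, "UNBIND", 6); /* made-up goodbye */
        close(cache.fd);
        cache.is_open = 0;  /* changes only the child's copy */
        _exit(w == 6 ? 0 : 1);
    }

    /* Parent: wait for the child, then look at both sides. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    char buf[16];
    ssize_t n = read(sv[1], buf, sizeof buf);
    int server_saw_unbind = (n == 6 && memcmp(buf, "UNBIND", 6) == 0);
    int parent_thinks_open = cache.is_open;  /* untouched by the child */

    close(sv[0]);
    close(sv[1]);
    return server_saw_unbind && parent_thinks_open;
}
```

If this reading is right, the parent's list entry was never
"released" at all; it simply points at a connection the server has
already been told to drop.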
I went and checked OpenLDAP, and found that it has (at least in
2.0.21+) connection caching built into the library. I thought,
at first, that this might be the cause of the problem -- a
conflict of caching code -- so I rebuilt the libraries without
"--enable-cache". This didn't change anything.
The next step was to force the ldap_connections list to NULL just
before the check for cached connections. This worked.
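
The experiment can be sketched roughly as follows. Only the name
ldap_connections comes from ldap.c; the struct layout and the helper
functions are hypothetical and much simpler than the real code. The
point is only that emptying the list before the lookup forces the
"open a new connection" path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cache entry; the real structure in ldap.c differs. */
struct ldap_conn {
    const char *host;
    int port;
    struct ldap_conn *next;
};

/* Global list head, named after the list mentioned above. */
struct ldap_conn *ldap_connections = NULL;

/* Push an entry onto the front of the cache list. */
void cache_add(struct ldap_conn *c)
{
    assert(c != NULL);
    c->next = ldap_connections;
    ldap_connections = c;
}

/* Return the cached entry for host:port, or NULL if there is none. */
struct ldap_conn *cache_find(const char *host, int port)
{
    for (struct ldap_conn *c = ldap_connections; c != NULL; c = c->next)
        if (c->port == port && strcmp(c->host, host) == 0)
            return c;
    return NULL;
}

/*
 * Returns 1 if a lookup would reuse a cached connection and 0 if it
 * would have to open a new one. With distrust_cache set, the list is
 * forced to NULL first; the entries are simply dropped here, not
 * unbound or freed.
 */
int would_reuse(const char *host, int port, int distrust_cache)
{
    if (distrust_cache)
        ldap_connections = NULL;
    return cache_find(host, port) != NULL;
}
```

Forcing the list to NULL like this is a diagnostic hack rather than
a fix, since whatever was on the list is abandoned without being
cleaned up.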
I've patched ldap.c (which I notice is the same in exim4, except
for the copyright date) to remove the connection caching code.
OpenLDAP has this in the libraries, and I would imagine that the
Netscape and Solaris LDAP libraries provide similar caching. This
also eliminates the global variable there.
Patch attached. I'm sure there are problems -- I keep telling
you I'm not a C programmer -- and it might introduce too much
overhead in older OpenLDAP libraries and other LDAP libraries, but
I hope it is useful to someone.
Mark.
--
Find inner peace and ten thousand around you shall be saved.
-- St. Seraphim of Sarov
--
Content-Description: patch against ldap.c
[ Content of type text/x-patch deleted ]
--