[Exim] Problem with callout caching

Author: Lionel Elie Mamane
Date:
To: exim-users
Subject: [Exim] Problem with callout caching

I have a setup where I do sender callout, but the options to
the callout depend on the _recipient_ domain (some users want
postmaster callout, others don't).

This interacts badly with the callout cache, leading to wrong results
(even with no_cache, due to the in-memory cache that lives as long as
the process): only the options to the *first* callout are honoured.

For the sake of example:

strict.org: sender callout with postmaster option
lax.org: sender callout without postmaster option

First scenario:

 MAIL From:<foo@???>
 250 OK
 RCPT To:<user@???>
 # Here, postmaster callout is done. Let's suppose it fails
 # debug output:
 #  check !verify = sender/callout=no_cache,postmaster,random
 550-Postmaster verification failed while checking <modelec@???>
 550-Called:   0.0.0.0
 550-Sent:     RCPT TO:<postmaster@???>
 550-Response: 550 5.2.1 users account disabled
 550-Several RFCs state that you are required to have a postmaster
 550-mailbox for each mail domain. This host does not accept mail
 550-from domains whose servers reject the postmaster address.
 550 Sender verify failed
 RCPT To:<user@???>
 # debug output
 #  check !verify = sender/callout=no_cache,random
 #  using cached sender verify result
 550 Sender verify failed
 # Should 250-Accept!

Second scenario:

MAIL From:<foo@???>
250 OK
RCPT To:<user@???>
# debug output
# check !verify = sender/callout=no_cache,random
250 Accepted
RCPT To:<user@???>
# debug output:
# check !verify = sender/callout=no_cache,postmaster,random
# using cached sender verify result
250 Accepted
# Should do postmaster callout and 550-refuse!

IMHO the correct thing to do (when sender callout is activated) is:

int callout(address)
{
 result = lookup (callout_cache, address);
 if (result == NOT_IN_CACHE)
 {
    result = do_callout(address);
    /* only a fresh, non-deferred result needs to be stored */
    if (result != DEFER)
       add (callout_cache, address, result);
 }
 return result;
}

which should be called in the following way:

 if (must_do_callout)
 {
   result = callout (sender_address);
   if (result == OK && postmaster_option_on_callout)
     result = callout (postmaster_address);
 }
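
The two fragments above can be put together as a small compilable C
sketch. This is only an illustration under assumptions, not Exim code:
the fixed-size array cache, the probe counter, the stand-in do_callout
(which rejects any postmaster@ address and accepts everything else) and
the names cache_lookup, cache_add and verify_sender are all hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Result codes for this sketch only (not Exim's values). */
enum { OK, FAIL, DEFER, NOT_IN_CACHE };

#define CACHE_MAX 16

/* Hypothetical per-address cache: one entry per probed address. */
struct cache_entry { const char *address; int result; };
static struct cache_entry callout_cache[CACHE_MAX];
static int cache_used = 0;
static int probes_sent = 0;          /* counts simulated SMTP probes */

static int cache_lookup(const char *address)
{
  for (int i = 0; i < cache_used; i++)
    if (strcmp(callout_cache[i].address, address) == 0)
      return callout_cache[i].result;
  return NOT_IN_CACHE;
}

/* Stores the pointer only; the caller must keep the string alive. */
static void cache_add(const char *address, int result)
{
  assert(cache_used < CACHE_MAX);
  callout_cache[cache_used].address = address;
  callout_cache[cache_used].result = result;
  cache_used++;
}

/* Stand-in for the real SMTP probe (assumption): any postmaster@
   address is rejected, every other address is accepted. */
static int do_callout(const char *address)
{
  probes_sent++;
  return strncmp(address, "postmaster@", 11) == 0 ? FAIL : OK;
}

/* Cache-aware callout for a single address. */
static int callout(const char *address)
{
  int result = cache_lookup(address);
  if (result == NOT_IN_CACHE)
    {
    result = do_callout(address);
    if (result != DEFER)             /* never cache a temporary failure */
      cache_add(address, result);
    }
  return result;
}

/* The sender and the postmaster address are checked and cached
   separately, so the postmaster option changes only which lookups
   are made, never what a cached entry means. */
static int verify_sender(const char *sender, const char *postmaster,
                         int want_postmaster)
{
  int result = callout(sender);
  if (result == OK && want_postmaster)
    result = callout(postmaster);
  return result;
}
```

With this shape, a strict recipient (want_postmaster set) followed by a
lax one for the same sender gives a failure and then a success, and
only two probes are sent in total, because the sender result is reused
from the cache while the postmaster result is simply not consulted.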

I looked a bit at the code. From reading the function do_callout (file
verify.c), it seems to try to do the right thing, though differently
from what I described here: the postmaster option is implemented
directly by do_callout, there is no intermediate callout function, and
do_callout manages the cache itself. The net result is that it does
not work, because do_callout is never called if sender_address is in
the cache; this happens in acl_verify in acl.c. (I'm probably confusing
the DBM cache and the memory cache here, but it is late at night.) The
code responsible is:

else if (verify_sender_address != NULL)
{
sender_vaddr = verify_checked_sender(verify_sender_address);

  if (sender_vaddr != NULL &&                      /* Previously checked */
      (callout <= 0 ||                             /* No callout needed; OR */
       testflag(sender_vaddr, af_verify_callout))) /* Callout was specified */
    {
    /* If no callout is required on this check, but callout was specified
    on the check that is cached, we inspect the "routed" flag. If this is set,
    it means that routing worked, so this check can give OK (the saved return
    code value belongs to the callout). Otherwise, use the saved return code. */

    if (callout <= 0 && testflag(sender_vaddr, af_verify_callout))
      rc = OK;
    else
      {
      rc = sender_vaddr->special_action;
      *basic_errno = sender_vaddr->basic_errno;
      }
    HDEBUG(D_acl) debug_printf("using cached sender verify result\n");
    }

This code snippet makes me think that the sender address is verified
only once per MAIL command, and thus that if the sender verification
options depend on the recipient, there is *always* a risk of giving
the wrong answer. Verifying the sender address only once is
essentially a (wrong) optimisation, not a feature, so it could be
disabled altogether as a quick fix. Later, if you want, you can do
expression analysis on the ACL to see whether it depends on the
recipient (the kind of analysis you already do on macros, which are
expanded at load time if they don't refer to $domain, $localpart and
the like) and thus selectively enable this optimisation.
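
To make the concern concrete, here is a minimal C model of the reuse
decision. It is a simplification under assumptions, not the code in
acl.c: the struct, the option bits and both function names are
hypothetical. The first test mimics a check that only asks whether a
callout was specified on the cached verification; the second reuses
the cached result only when the option set is identical. That is an
in-between variant, neither the quick fix nor the expression analysis
suggested above.

```c
#include <assert.h>
#include <stddef.h>

enum { OK, FAIL };

/* Hypothetical option bits for a sender verification. */
#define OPT_CALLOUT    0x1u
#define OPT_POSTMASTER 0x2u

/* What was remembered from the first check in this message. */
struct cached_verify { unsigned options; int rc; };

/* Flag-only test (model of the behaviour described above): reuse
   whenever no callout is wanted now, or a callout of any kind was
   done on the cached check. */
static int reusable_flag_only(const struct cached_verify *c, unsigned wanted)
{
  return c != NULL &&
         (!(wanted & OPT_CALLOUT) || (c->options & OPT_CALLOUT));
}

/* Stricter test: reuse only if the cached check ran with exactly
   the options wanted now. */
static int reusable_same_options(const struct cached_verify *c,
                                 unsigned wanted)
{
  return c != NULL && c->options == wanted;
}
```

In the first scenario the cached entry has both bits set with a FAIL
result and the second recipient wants only OPT_CALLOUT: the flag-only
test reuses the failure, the stricter one does not. In the second
scenario the roles are swapped and the flag-only test wrongly reuses
the success.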

Am I making sense? If not, say so, I'll try to be more clear.

Version information:

(I checked the 4.30 changelog, it doesn't mention any related change)

Exim version 4.24 #1 built 17-Dec-2003 21:46:25
Copyright (c) University of Cambridge 2003
Berkeley DB: Sleepycat Software: Berkeley DB 3.2.9: (April 7, 2002)
Support for: iconv() IPv6 PAM GnuTLS
Authenticators: cram_md5 plaintext spa
Routers: accept dnslookup ipliteral manualroute queryprogram redirect
Transports: appendfile/maildir/mailstore/mbx autoreply lmtp pipe smtp
Fixed never_users: 0
Contains exiscan-acl patch revision 13 (c) Tom Kistner [http://duncanthrax.net/exiscan/]
Configuration file is /var/lib/exim4/config.autogenerated
