RE: [Exim] Exim 4 planning

Author: John Horne
Date:  
To: exim-users
Subject: RE: [Exim] Exim 4 planning
On 07-Jun-2001 at 11:10:38 Phil Chambers wrote:
> I wanted to have another look at Philip's Exim 4 planning documents but
> found that the document he published on 16 Jan at
>
> ftp://ftp.csx.cam.ac.uk/pub/software/email/exim/Exim4Plan.html
>
> no longer exists. On going to www.exim.org I was unable to find any links
> relating to the Exim 4 planning.
>
> Can anyone point me to the latest state of Philip's planning?
>

Attached is a text copy of the Jan 15/16 document. Whether it is the
'latest' or not we will probably have to wait for Philip's return :-)

Regards,

John.

------------------------------------------------------------------------
John Horne, University of Plymouth, UK           Tel: +44 (0)1752 233914
E-mail: jhorne@???
PGP key available from public key servers

Planning for Exim 4

Author: Philip Hazel
Last Updated: 15-Jan-2001

This is the second `edition' of this document, revised in the light of much
useful comment on the mailing list and privately. New and changed text
appears in green, like this paragraph. Some of the more radical suggestions
from the discussion have not been adopted. Were I starting again from
scratch, I might adopt them, but my intention is to upgrade Exim, not
rewrite it. I also want to avoid too much complexity.

------------------------------------------------------------------------

There comes a time in the life of every program when it needs a good spring
clean. Exim is now over 5 years old, and it has acquired far more facilities
than I ever imagined when I started it. Inevitably, this has led to rough
corners and duplication, both in the facilities themselves, and internally
in the code. At the same time, there are outstanding requests for new
features which cannot easily be done within the current implementation.

Some tidying up was done at the transition from the 2.12 release to release
3.00, but it was mainly in one particular area. I think the time has now
come to consider all the possibilities for tidying up and reformulating the
way Exim works. I am calling the result Exim 4. With luck, if I get this
right, Exim will then be able to go on for another five years without any
big incompatible changes.

What follows is a discussion document that lays out all the possibilities
that I have identified. Some are big and radical; some are small and minor,
and might even be accommodated in a compatible way. I would like people to
read and comment on all of them. The suggestions are not cast in concrete
and might well change as a result of comments. Tell me what you think.

Should I embark on this major re-working or not? The feedback from the
mailing list has been unanimously supportive, so it looks like some of this
at least will go ahead.

I estimate that it will take at least 6 months, probably more, to do all the
code changes and re-write the documentation, depending on how much time I
can give it. However, I think the result would be a simpler, tidier Exim,
both in the code and in the documentation. (But the book I am currently
trying to finish would be partially obsolete.) Further 3.2x releases will be
for maintenance only, or very minor enhancements.

Upgrading to the proposed 4.0 release will require changes of configuration
file, but I will provide a Perl script for converting 3.x configurations
into 4.x configurations, just as I did for the transition from 2.x to 3.0.
However, this time it might be more difficult to write something that copes
with every possibility, and there are likely to be cases where the best it
can do will be to warn you to check certain options.

What follows is not an exhaustive list, but I think it covers the big issues
and the major points. I have not worked through the implementation of these
proposals. There are likely to be changes of detail when I actually get down
into the code, but not, I hope, major changes of direction.

Amalgamating directors and routers

One big proposed change is the abolition of the distinction between
directors and routers. They started out very different, but have grown more
and more similar. There is a lot of duplication in the code, and it also
seems quite hard to explain the distinction to new Exim users.

Getting rid of the differences would make explaining things easier and the
code would shrink. Also, it would provide a `smartuser router' and a
`queryprogram director' which are things that have been requested.

What I propose is to abolish the word `director', and just call all of them
`routers', unless anybody can come up with a better word. So from here on,
when I say `router', I mean one of these new things.

So, how would it work? Instead of two configuration sections, there would be
just one list of `routers' in the configuration file, and the default might
look like this:

    remote_domains:
      driver = lookuphost
      domains = ! my.local.domain
      transport = remote_smtp
      no_more


    system_aliases:
      ...


... followed by what are currently called `directors'. [Actually, in the
light of other proposals later in this document, it wouldn't be quite like
this. But one thing at a time. I give a complete default configuration
below.]

So far, this is easy enough, but there are some quite deep implications. In
effect, removing the distinction between directors and routers means that
the concept of `local domains' disappears. The existing Exim uses the
local_domains list not only for determining whether to use directors or
routers, but also for one or two other things. However, some of these are no
longer really necessary and I think these references can all be removed
and replaced in various ways.

So, do we still need a local_domains option at all? I think we need
something, but not as screwed down as the current option. In some
configurations, lists of domains may get repeated several times. Macros can
be used for this, but they can't easily be negated as a whole, and in any
case, macros are just syntactic objects. If several domain lists contain the
same macro, the tests are run for each one.

I have an idea for a new, more general feature. I propose a new option
called domain_list, which takes as its value a name and a string. So, for
example, you could say

    domain_list xyz_domains = a.b.c : d.e.f : ...


This defines a `private' domain list with the name xyz_domains. Such lists
can be referred to in other domain lists by giving their name, preceded by a
+ sign. So, you can re-invent local_domains using this mechanism by setting

    domain_list local_domains = what.ever.domains


and then the example router above could contain

    domains = ! +local_domains


instead of quoting the domain. The initial + character is specified so that
these items are syntactically different from a domain name.

While routing an address, the results of tests of the indirect domain lists
are cached when they are first done, so multiple tests of the same list are
efficient. By default, up to 32 `private' domain lists can be defined (so
that a 32-bit word can hold the cached values), but I will make this limit a
build-time configuration setting.
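
A minimal sketch, in Python rather than Exim's C, of how the per-address caching of named domain lists described above might work. Every name here (NamedDomainLists, match_list, match_named, new_cache) is hypothetical and not taken from Exim. The sketch also assumes that a list whose last item is negated yields `yes' when nothing matches, and it handles only literal domains, negation with !, and +name references.

```python
MAX_NAMED_LISTS = 32  # proposed default, so one 32-bit word can hold the cache


class NamedDomainLists:
    """Hypothetical registry of `private' domain lists."""

    def __init__(self):
        self.items = {}   # list name -> colon-separated list value
        self.bit = {}     # list name -> bit mask used in the cache words
        self.lookups = 0  # number of real (uncached) evaluations, for illustration

    def define(self, name, value):
        if name not in self.bit:
            if len(self.bit) >= MAX_NAMED_LISTS:
                raise ValueError("too many named domain lists")
            self.bit[name] = 1 << len(self.bit)
        self.items[name] = value


def new_cache():
    # One cache per address being routed: which lists have been tested,
    # and what the result of each test was.
    return {"known": 0, "result": 0}


def match_list(value, domain, lists, cache):
    """Test a domain against a list value such as '! +local_domains'."""
    negated = False
    for raw in value.split(":"):
        item = raw.strip()
        negated = item.startswith("!")
        if negated:
            item = item[1:].strip()
        if item.startswith("+"):
            hit = match_named(item[1:], domain, lists, cache)
        else:
            hit = item == domain
        if hit:
            return not negated
    # Assumption: no match means `no', unless the last item was negated.
    return negated


def match_named(name, domain, lists, cache):
    """Test a named list, consulting the per-address cache first."""
    bit = lists.bit[name]
    if cache["known"] & bit:
        return bool(cache["result"] & bit)
    lists.lookups += 1
    hit = match_list(lists.items[name], domain, lists, cache)
    cache["known"] |= bit
    if hit:
        cache["result"] |= bit
    return hit
```

With this arrangement, several routers that each test +local_domains for the same address cause the underlying list to be scanned only once; later tests are answered from the two bit words.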

New Generic Options

For the new routers, some options that were previously only in some
directors become generic, and there's one new one as well:

   * An option called lowercase_local_part. This applies to local parts that
     the router handles and replaces the existing locally_caseless option,
     which applies to all local addresses. An option per router is more
     flexible. The forcing of lower case applies to the local part that is
     used by Exim in its processing. When a message is actually delivered
     with an envelope, the original casing is retained in the envelope. This
     means that the default for the new option can be TRUE because the local
     part isn't (normally) used in routers that handle remote addresses
     (where local parts must be handled casefully).


   * check_localuser (from the existing forwardfile director) and
     match_home_directory (from forwardfile and localuser) become generic
     options that can be applied to any router. In discussion it was pointed
     out that match_home_directory could actually be abolished, because its
     function can now be done by an appropriate setting of the condition
     option. In the interests of simplification, I now propose to do this.


Currently, there is a difference between the routers and directors in the
matter of retrying after temporary failures. Directors are always run in
non-queue-run deliveries; after temporary errors, they are delayed only in
queue runs. In the new world, all routers will have to be run for non-queue
deliveries to retain the same effect. This will create an extra bit of load
in the case of remote routing failures, because the routing will always be
tried for a new message. I don't actually think this is a big problem, but
if it is felt to be a bad thing, there could be a generic option along the
lines of delay_routing_in_non_queue_run which could be set on those routers
that do DNS lookups (say).

Changes to conditional running of routers

The way that various conditions are applied to routers (and directors in the
`old world') is complicated, for no particular good reason, as far as I can
see (it just happened that way). Conceptually, there are three stages (an
explicit explanation like this should be added to the manual):

   * Tests are done before running the driver (e.g. tests on the domain). If
     any of these tests fail, the driver is skipped, and the next driver is
     run.


   * If the initial tests succeed, the current driver is run.

   * If the driver declines to handle the address, more tests are done to
     decide whether to run the next driver or not. At present the only test
     is on whether more is set.


The confusion is that most of the conditional options such as domains and
local_parts act as `initial tests', but require_files and senders don't.
Their failure causes the driver to decline rather than skip. (It is
documented that verify_sender and verify_recipient are also like this, but
in fact the documentation is wrong.) If either of the require_files or
senders tests fails, the more option is inspected. Of course, in many cases,
more is set true, so this anomalous behaviour doesn't make any difference.
Nevertheless, I think things should be regularized.

I propose, therefore, that require_files and senders be made to act like all
the other initial conditions. This change will make all the generic options
act in the same way. The effect of no_more will be limited to situations
when the driver does actually run, but declines to handle the address.

Discussion of this issue raised some more radical suggestions. Some of them
involved paying attention to the order in which the options are given. This
would be a big change for Exim, one that I don't really want to make. I
think the proposed new state will be straightforward to explain: `These
options (domains, local_parts, condition, etc.) are conditions that are
tested first. If the conditions are not met, the router is skipped.
Otherwise, the router is run, and its result is one of accept, decline,
fail, defer, pass, or error.' Then explain what happens in each case,
including how more and unseen work.

I have brooded on more complicated changes, including some suggestions for
grouping routers, as well as on the order of conditions. The conclusion I
have reached is that anything other than the relatively simple way it
currently works becomes a `router programming language' pretty quickly.
There would be no point in anything halfway - if this were to happen, one
might as well use the knowledge we have of programming languages, and
implement something relatively complete. It might be wonderful and powerful,
BUT it would take a lot of designing and implementing, and I don't think I
want to go down that road starting from here. So I leave that idea for the
next generation MTA.

Meanwhile, it would probably help if the various returns from routers were
more clearly explained. This is the list:

   * skip: The pre-conditions for running the router were not met. The
     address is passed to the next router.


   * accept: The router accepted the address, and either queued it for a
     transport, or generated one or more `child' addresses. Processing the
     original address ceases, unless `unseen' is set on the router, in which
     case the address is passed to the next router.


   * decline: The router declined to accept the address because it did not
     recognize it at all. The address is passed to the next router, unless
     no_more is set.


   * pass: The router recognized the address, but could not handle it
     itself. It requests that the address be passed to the next router. This
     overrides no_more.


   * fail: The router determined that the address should fail, and has
     queued it for the generation of a bounce message. There is no further
     processing of the original address, unless `unseen' is set.


   * defer: The router cannot handle the address at the present time. (For
     example, a database may be offline.) No further processing of the
     address happens in this delivery attempt. It is tried again next time.


   * error: There was some error in the router (for example, a syntax error
     in its configuration). The action is as for defer.
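
The list of returns above can be sketched as a small driver loop. This is an illustration in Python under stated assumptions, not Exim code: the function route, the dictionary keys precondition, run, more and unseen, and the returned trace are all hypothetical. What happens to an address once the loop stops (delivery, bounce, retry) is deliberately left out.

```python
def route(address, routers):
    """Run an address through a list of routers.

    Each router is a dict with a name, a precondition callable, a run
    callable that returns one of the result names, and optional boolean
    more (default True) and unseen (default False) settings.
    Returns a trace of (router name, result) pairs.
    """
    trace = []
    for router in routers:
        # Stage 1: pre-conditions. Failure means skip; `more' is not consulted.
        if not router["precondition"](address):
            trace.append((router["name"], "skip"))
            continue

        # Stage 2: the router itself runs.
        result = router["run"](address)
        trace.append((router["name"], result))

        # Stage 3: decide whether the next router sees the address.
        if result in ("accept", "fail"):
            if router.get("unseen", False):
                continue
            break
        if result in ("defer", "error"):
            break  # nothing more happens in this delivery attempt
        if result == "decline" and not router.get("more", True):
            break  # no_more stops processing after a decline
        # `pass', or `decline' with more set: go on to the next router
    return trace
```

Note that `pass' continues to the next router even when no_more is set, and that no_more has no effect when the pre-conditions fail, which is the regularized behaviour proposed above.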


Revamping the old directors

Several people have noticed a lot of overlap between the various director
drivers. This has arisen as new features have been added. Also, with the
possibility of doing database lookups, names like aliasfile seem out of
place. It seems to me that all the existing director facilities can be
replicated by two new-style routers as follows:

1. A router called redirect, which replaces aliasfile, forwardfile, and the
use of smartuser without a transport. A transport must not be provided for
this router.

2. A router called accept, which replaces localuser and the use of smartuser
or aliasfile with a transport. A transport must always be set for this
router. It has no private options - all tests on the address are generic.

In other words, the two cases of (a) generating new addresses and (b)
accepting addresses for delivery are made more separate.

The redirect router provides all the facilities for generating one or more
child addresses from an original address. It would work like this:

   * If the file option is set, the contents of the file would be the list
     of new addresses or filtering instructions (like the old forwardfile).


   * If the file option is not set, the data option must be set. This is
     expanded to obtain a list of addresses or filtering instructions (like
     data in 3.20's forwardfile and new_address in the old smartuser). If
     the expanded string is empty, the driver declines.


There is no need for search_type and query or queries because such lookups
can be coded as part of data. There is also no need for expand and
include_domain, because these too can be coded as part of data. This is a
big simplification.

The owners and owngroups options are probably of some use, for example, to
check ownership of an alias file. However, having them as overall options
isn't right any more, because a file isn't always involved. Instead there
should be a new generic option that can check the ownership and the
permissions of arbitrary files.

The directory_transport and directory2_transport options could be abolished.
The (minority use) features they provide can be obtained by making use of
the fact that file_transport is expanded (it didn't use to be in early
Exims). However, in the light of comments received, I now propose just to
abolish directory2_transport, and retain directory_transport.

The current forwardfile has a messy specification for file_directory and a
home directory. However, we still want the facility of checking for a
directory before looking for a .forward file. A more transparent scheme
would be to check the directory explicitly, possibly using some variation of
require_files.

Revamping the old routers

There's a lot of overlap between the existing lookuphost and domainlist
routers. I propose to amalgamate these into a single router called
domainlist. All the existing domainlist really needs is the addition of a
few extra options that lookuphost has (such as widen_domains).

The use of the separate options route_file, search_type, route_query, and
route_queries can be cleaned up in the same way as is proposed for the new
redirect router above. They are all replaced by route_data, which is
expanded to yield a single set of routing data.

Although the facilities provided by route_list could be subsumed into
route_data, it could lead to very messy settings with nested conditional
expansion strings. Therefore, I propose to retain route_list.

Sample new routing configuration

If all the above suggestions are adopted, the default configuration for the
new routers will look something like this. I've set up a `private' domain
list called local_domains, containing just the local host name. I've assumed
a new option called check_files, which can do various tests on files and
directories. The syntax is not cast in stone; this is still at a conceptual
stage. The idea is that various tests can be specified, and the result of
failing the tests can be given (e.g. defer, decline, fail, etc.).

    domain_list local_domains = @


    remote_domains:
      driver = domainlist
      domains = ! +local_domains
      route_list = * $domain bydns
      transport = remote_smtp
      no_more


    system_aliases:
      driver = redirect
      data = ${lookup{$local_part}lsearch{/etc/aliases}}
      file_transport = address_file
      pipe_transport = address_pipe


    userforward:
      driver = redirect
      check_ancestor
      check_localuser
      check_files = exists/defer $home : \
                    owners=($local_part)/defer $home/.forward : \
                    modemask=022/defer $home/.forward
      file = $home/.forward
      file_transport = address_file
      pipe_transport = address_pipe
      reply_transport = address_reply
      no_verify
      no_expn


    localuser:
      driver = accept
      check_localuser
      transport = local_delivery


One more new router

When an address is processed as an alias or by forwarding, one or more
`child' addresses, each with a new envelope recipient, are created. Each is
then routed independently. In the old domainlist router, there is the
ability to change the name of the domain that is being routed, and carry on
with the next router. This implements `Route this domain as if it were that
domain', and has uses in certain environments. The point is that it does not
change the envelope recipient.

It seems to me that this approach may sometimes be what is wanted for
`one-to-one' aliases, that is, aliases such as

    Full.Name:  login.name


Also, on hosts where login names are case-sensitive, the initial router that
sorts out the casing by using a lookup does a similar thing.

I would like to take that functionality out of domainlist because it is a
different kind of activity to actually setting up a host list, etc. However,
the (new) redirect router isn't the place for it, because it allows multiple
children, pipes, files, etc. I propose, therefore, to invent a new router
that I am calling the masquerade router. It has a single option:

    route_address = a.new@ddress


If the option expands successfully, the address that is being used for
routing is changed, but the actual envelope recipient address is not. The
new address is routed independently. If the expansion is forced to fail, the
router declines.
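
The distinction between the routing address and the envelope recipient can be sketched as follows. This is a hypothetical Python illustration only: the Address class, the ONE_TO_ONE table, and the "rerouted" return value are invented for the example, and returning None stands in for a forced expansion failure.

```python
class Address:
    """An address being routed (hypothetical structure)."""

    def __init__(self, recipient):
        self.envelope_recipient = recipient  # sent in the envelope; never changed here
        self.routing_address = recipient     # what subsequent routers look at


# Stand-in for a lookup of `one-to-one' aliases.
ONE_TO_ONE = {"Full.Name": "login.name"}


def expand_route_address(address):
    """Stand-in for expanding route_address; None models a forced failure."""
    local_part, _, domain = address.routing_address.partition("@")
    new_local_part = ONE_TO_ONE.get(local_part)
    if new_local_part is None:
        return None
    return new_local_part + "@" + domain


def masquerade(address):
    """Change the address used for routing, leaving the envelope alone."""
    new_address = expand_route_address(address)
    if new_address is None:
        return "decline"
    address.routing_address = new_address
    return "rerouted"  # hypothetical label: the new address is then routed
```

After masquerade() succeeds for Full.Name, later routers see login.name, but the message is still delivered with Full.Name as its envelope recipient.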

This looks like the existing smartuser without a transport setting, but it
is different because it does not change the envelope recipient address.

Saving looked up data with an address

Some sites keep a number of data fields for local users in a file or
database. It is most efficient if only a single lookup is done, and the
various fields are picked out of the result, as needed. This happens at
present if the same lookup is specified several times, because Exim caches
the results of lookups. However, to make this work you have to specify an
identical lookup each time, which is tedious. I propose the following
feature to make this easier:

There is a new generic option for routers, called address_data. Its value is
a string that is expanded when an address `reaches' the router, that is,
once the decision is taken to run the router (on the basis of domain,
condition, etc.) but before any of the other options are inspected. If the
expansion is forced to fail, the router declines (and more is inspected).
Other expansion failures are configuration errors, and the address is
deferred. The result of a successful expansion is saved with the address,
and can be accessed by the variable $address_data.

For example, consider a site where /etc/passwd is not in use. Information
about users is kept in a file like this:

    jack: uid=1234 gid=3333 home=/home1/jack
    jill: uid=4321 gid=3333 home=/home2/jill


A router for matching local users is

    local_users:
      driver = accept
      address_data = ${lookup{$local_part}lsearch{/the/file}{$value}fail}
      transport = local_delivery


The fail setting ensures that the router declines unless some data can be
looked up for the local part. This particular example doesn't gain much for
the router, because the data is used only once. However, the transport can
also use the same data, without having to specify another lookup, like this:

      local_delivery:
        driver = appendfile
        file = /var/mail/$local_part
        user = ${extract{uid}{$address_data}}
        group = ${extract{gid}{$address_data}}


The equivalent configuration in `old style' would do more lookups if the
message had several local recipients, because the caching for all but the
last recipient would be lost before transport time.
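
A rough Python model of the jack/jill example, to show the single lookup being reused at transport time. The helper names (lsearch, extract, local_users_router, local_delivery_settings) are hypothetical stand-ins for the expansion items and drivers, and the assumption that a missing field extracts as an empty string is mine.

```python
# The user file from the example above, one record per line.
USER_FILE = [
    "jack: uid=1234 gid=3333 home=/home1/jack",
    "jill: uid=4321 gid=3333 home=/home2/jill",
]


def lsearch(lines, key):
    """Linear search of 'key: data' lines; None if the key is absent."""
    for line in lines:
        name, sep, data = line.partition(":")
        if sep and name.strip() == key:
            return data.strip()
    return None


def extract(field, data):
    """Pick one name=value field out of a looked-up string."""
    for pair in data.split():
        name, sep, value = pair.partition("=")
        if sep and name == field:
            return value
    return ""  # assumption: a missing field yields an empty string


def local_users_router(local_part, address):
    """Do the lookup once and save the result with the address."""
    data = lsearch(USER_FILE, local_part)
    if data is None:
        return "decline"  # models the forced failure of the expansion
    address["address_data"] = data
    return "accept"


def local_delivery_settings(local_part, address):
    """The transport reuses the saved data; no second lookup is needed."""
    data = address["address_data"]
    return {
        "file": "/var/mail/" + local_part,
        "user": extract("uid", data),
        "group": extract("gid", data),
    }
```

Because the data travels with the address, it is still available at transport time even when other recipients have been routed in between.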

In discussion, it was suggested that arbitrary variables be implemented,
instead of just $address_data. This is one of the areas where I decided to
leave this simple feature rather than add a lot of internal and syntactic
complexity.

Another discussion of this proposal pointed out that there are times when
you might want the address_data string expanded before checking the various
pre-conditions, because you want to use its value in several of them. This
could be made automatic: if any of the pre-condition variables contained a
reference to $address_data, it could be expanded first. This is very much a
kludge which wouldn't work if, for example, a looked up string containing
$address_data was expanded, but it would work for most practical cases.
However, I feel that implementing this is likely to cause confusion, so I do
not propose to do it.

Otherwise, in the absence of a fully-blown `programming language' for
routers, there would have to be two variables, one looked up early and one
late. I do not like this complication, which again could be confusing.
Although it is tedious to repeat a lookup in several conditions, the fact
that they are tested one after the other does mean that the caching probably
saves lookup repeats in this situation.

Consequential change to queryprogram

The existing queryprogram router has a facility for setting a data string
that can subsequently be accessed by $route_option. No other router can set
this variable. The data is taken from the string returned by the program
that queryprogram runs. I propose to abolish $route_option, and instead
allow the queryprogram router to add the data to the new $address_data
variable just described.

Transport changes

Currently, local deliveries are run in subprocesses, but remote deliveries
are run in subprocesses only if remote_max_parallel is set. One feature that
people have asked for is fallback transports. This can't be done at present,
because Exim has given up its privilege before running remote transports.

A simplification would be to run all remote deliveries in subprocesses, even
if not in parallel. As well as simplifying the code, this would allow Exim
to retain privilege in the main process, thereby allowing it to run a local
transport as a fallback for a remote transport. If this is done, user and
group can become generic options for all transports.

In the smtp transport, the hosts_randomize option applies only to hosts
defined in the transport. It should be made also to apply to host lists
supplied by a router. This would mean that many addresses with the same host
list could be sent to the transport at once, and then the list would get
randomized. When randomizing is done in the router, addresses end up with
different host lists, and so are not batched up.

The arrangements for controlling batch deliveries in local transports are
untidy, with both a batch and a bsmtp option, dating from the early days.
Even when batch is set, Exim does some checking of its own before sending
multiple addresses to a local transport. For example, if the transport's
options contain any reference to $local_part, only one address is ever sent
at a time, and if there is any reference to $domain, only addresses with the
same domain are sent. Given this, the batch option is almost redundant -
batching could be allowed to happen automatically, according to the
transport's configuration.

However, there may be cases where other data is used; in particular,
references to the new $address_data variable described in the previous
section. I propose to replace batch with a new option called batch_id which
imposes an additional condition on batching. It is an expanded string. For
each address it is expanded, and addresses must end up with the same string
in order to be batched. This allows the admin to apply arbitrary conditions
to batching.

There is also the existing batch_max option, which could be set to 1 to
suppress batching altogether.
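
As an illustration of how batch_id and batch_max might interact, here is a short Python sketch. The function batch_addresses and its arguments are hypothetical; the batch_id callable stands in for expanding the option's string for each address.

```python
def batch_addresses(addresses, batch_id, batch_max):
    """Group addresses for one local transport.

    Addresses are batched together only if batch_id() yields the same
    string for each of them, and no batch exceeds batch_max addresses.
    """
    groups = {}
    for address in addresses:
        # Addresses whose expanded batch_id differs never share a batch.
        groups.setdefault(batch_id(address), []).append(address)

    batches = []
    for members in groups.values():
        # Split each group into chunks of at most batch_max addresses.
        for start in range(0, len(members), batch_max):
            batches.append(members[start:start + batch_max])
    return batches
```

Setting batch_max to 1 in this model gives one address per delivery, whatever batch_id expands to.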

What about batch SMTP? All we need is a boolean option to specify that
delivery is to be in SMTP format. The effect of the various settings of the
existing bsmtp option can be reproduced using batch_id and batch_max if
required (but may often fall out naturally). I propose to create use_bsmtp
to turn on SMTP delivery format. Currently, there is an option called
bsmtp_helo to turn on the insertion of a HELO line. I think we can do
without this - the prefix option (which needs to be changed for batch SMTP
anyway) could be used to supply a HELO line.

Is the require_lockfile option of appendfile, which can be unset to make it
try to create a lock file but not complain if it can't, actually useful?
This goes back a long way; it doesn't seem to me to be of much use.

Changes to security features

When I started writing Exim, I didn't know very much about Unix security. I
had picked up the idea that using seteuid() is bad, but nobody ever really
explained to me why this was so. Nevertheless, I implemented local
deliveries in separate processes that used setuid() because of this worry.
However, Exim does make some use of seteuid(). People had said to me `Why
not give up privilege when you don't need it for a while? What can you
lose?'

Well, at long last I have finally understood why the use of seteuid() is a
bad thing, and exactly what there is to lose. The problem is not in using
seteuid() within the program itself; it is with other programs. When Exim
gives up root privilege temporarily by calling seteuid(exim), it is
vulnerable to manipulation by other processes running as exim. Such a
process could perhaps modify Exim's memory so that when it uses seteuid()
again to regain root, it misbehaves. What this means is that a cracker who
has broken into the exim account could, in theory, under certain conditions,
use Exim as a stepping-stone to root. (In discussion, I was told that this
behaviour does not in fact happen in all operating systems.)

Because of this theoretical possibility, and also because the security
option is in any case complicated and messy, I propose to redo the security
aspects of Exim so that it uses only setuid() for changing privilege. The
only need for a security option will be to set `unprivileged'. Abandoning
the use of seteuid() will require Exim to do a bit more processing while
handling some local message routing, but I do not think this will be a
problem.

Policy rejection features

The very early releases of Exim had no policy rejection features, though
they did have verification. As various features have been added, this area
has got very complicated, and I think some simplification is in order.

It is now generally accepted that the best way to reject messages from
remote hosts is to reject RCPT commands, because some MTAs keep on retrying
after error responses to MAIL or after the data. Rejecting RCPT also allows
exceptions to be made (e.g. postmaster). The original 2-stage rejection
scheme (which became a 3-stage scheme) is baroque. I propose to abolish it,
and implement all rejections at RCPT time, apart from those that require the
message to be read so that its headers can be scanned. Rejections because of
bad header content have to happen after the data phase.

This change means that we no longer need both sender_reject and
sender_reject_recipients. I propose to abolish the name
sender_reject_recipients, but change the action of sender_reject to reject
at RCPT time.

What about host_reject and host_reject_recipients? The same change should, I
think, be made. However, is there a requirement to be able to specify an
error response at host connection time? If there is, I think a more clearly
named option such as host_connect_reject should be used.

These changes are relatively small: in the next but one section below, I
make an even more radical proposal. It starts with these thoughts:

The various options for controlling relaying are complicated, and their
interactions are not clear. A consolidated way of specifying multiple
options is needed, and this idea was suggested on the mailing list:

    host_accept_relay = [AUTH,TLS]* : localhost : ![RBL]*


This example means `If authenticated and using TLS, accept relay from any
host; otherwise accept relay from local host; otherwise accept from any host
that is not on an RBL list'. (Not a real example!)

More detail is in fact needed for RBL checking, because people want to do
different things for different RBL lists. Something like

    [RBL=dul.maps.vix.com]


is probably needed. In fact, the whole area of RBL processing needs
revamping.

RBL processing

The original simple RBL check has proved too simple for many people, and it
has already been extended by the addition of various options. Further
flexibility has been requested. People want to:

   * Do different RBL processing for different sets of hosts. (At present
     there are only two sets - those that get RBL processing and those that
     don't.) For example, a smarthost might want to reject messages from any
     of its clients that are blacklisted, but only put warnings into
     incoming messages from other hosts that are blacklisted.


   * Do different RBL processing in relaying and non-relaying cases. For
     example, a smarthost might want to ban any blacklisted client hosts
     from relaying, while still accepting mail for local delivery from them
     (perhaps with a warning). This means different processing for different
     RCPT addresses.


   * Do different RBL processing for different recipients. (Some users might
     request rejection of any mail from blacklisted hosts; others may
     request the opposite.)


   * Combine the RBL tests with other tests, as mentioned above. For
     example, to say `if authenticated, ignore any RBL listing'.


   * Tailor the contents of the X-RBL-Warning: line.


   * Log the value returned from an RBL lookup in the reject log.

Consolidation of policy rejection features

As a result of thinking about the points mentioned in the last two sections,
I have come up with a radical proposal for an entirely new way of handling
all the policy controls. It goes like this:

At present, Exim runs checks on a host when it connects. More flexibility
could be achieved by delaying the check until the time of a RCPT command.
The results of any check not involving the recipient would of course be
remembered to avoid unnecessary repetition.
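
A minimal sketch, in Python, of how a result that does not depend on the recipient might be remembered across several RCPT commands on one connection. The class, the function names and the blacklisted address are assumptions made for illustration; nothing here is part of the proposal itself.

```python
class Connection:
    """Hypothetical per-connection state holding remembered check results."""

    def __init__(self, host):
        self.host = host
        self._cache = {}   # check name -> remembered result
        self.lookups = 0   # how many times a check was really run

    def check_once(self, name, check):
        """Run check(host) the first time only; reuse the result afterwards."""
        if name not in self._cache:
            self.lookups += 1
            self._cache[name] = check(self.host)
        return self._cache[name]


def is_blacklisted(host):
    # Stand-in for an RBL lookup; the address set is invented for the example.
    return host in {"10.9.8.7"}


conn = Connection("10.9.8.7")
# Three RCPT commands on the same connection cause a single lookup.
results = [conn.check_once("rbl", is_blacklisted) for _ in range(3)]
```

A check that involves the recipient would bypass the cache and run for every RCPT command.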

The current long list of policy control options is confusing, and it isn't
always clear in what order the checks are run and how they interact with
each other. The radical approach is to abolish all the existing options, and
replace them with a single means of specifying whether to accept or reject a
recipient.

The following options would be abolished by this proposal:
host_accept_relay, host_auth_accept_relay, host_reject,
host_reject_recipients, prohibition_message, rbl_reject_recipients,
rbl_domains, rbl_hosts, rbl_warn_header, receiver_verify,
receiver_verify_addresses, receiver_verify_hosts, receiver_verify_senders,
recipients_reject_except, recipients_reject_except_senders, relay_domains,
relay_domains_include_local_mx, relay_match_host_or_sender,
sender_address_relay, sender_address_relay_hosts, sender_reject,
sender_reject_recipients, sender_verify, sender_verify_batch,
sender_verify_hosts, sender_verify_reject, tls_host_accept_relay.

Big simplification!

The original proposal was for an option called accept_recipient, whose value
was a boolean expression. However, after discussion, it is now proposed that
an Access Control List (ACL) should be used, because it is easier to explain
and implement, and there are fewer syntactic problems in adding it to the
existing configuration.

The new proposal is for a new section of the configuration file to hold
access control statements. In fact, once we started thinking about this, it
became clear that having more than one ACL would be helpful. I have now
realized that there are things other than recipients that might usefully be
controlled by an ACL. For example, currently smtp_etrn_hosts is a host list;
it controls purely by host identity. However, someone may want to implement
other tests, such as `is the host authenticated?', so it might be useful to
be able to apply an ACL to smtp_etrn_hosts and other similar options. For
this reason, the proposal that follows has an additional layer of
indirection.

The (new) second section of the configuration file contains any number of
Access Control Lists. Each list begins with a name that is terminated by a
colon. The lines that follow, up to the next name line or the `end' line,
comprise the named list. For example:

    local_acl:
      accept recipient = +local_domains : verify_recipient : verify_sender


    relay_acl:
      accept recipient = +relay_domains
      accept host_authenticated


The colons in individual ACLs are interpreted as `and'. The ACLs are used by
being referred to from options in other parts of the configuration. For
incoming messages, the accept_recipient option lists the ACLs to apply to
the arguments of RCPT commands. For example,

    accept_recipient = local_acl : relay_acl


The colons in these lists are interpreted as `or'. In this example, the
recipient is accepted if either of the two ACLs accepts it. ACLs can also be
referred to from other ACLs. (There will be a check to prevent looping.)

There are two different kinds of line that can appear in an ACL. The first
type consists of a command, followed by a list of conditions. When such a
line is inspected, it can either grant or deny access, or pass, in which
case Exim inspects the next ACL line. If the end of the list is reached,
access is denied. The commands that behave like this are:

   * accept: access is granted if the conditions are met; otherwise it
     passes.


   * deny: access is denied if the conditions are met; otherwise it
     passes.

   * pass: this command passes if the conditions are met; otherwise it
     denies access. This is a bit of syntactic convenience to make it easier
     to write some conditions that would otherwise have to use deny with
     negative conditions.


   * warn: this command always passes, but if the conditions are met, a
     warning header line is added to an incoming message. This header line
     is present in all copies of the message that are delivered. It is not
     possible to have different warnings for different recipients.
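
The behaviour of the four commands can be sketched in Python as follows. This is only an illustration under stated assumptions: conditions are modelled as Python callables taking a context dictionary, the warning is a placeholder string, and the names run_acl, is_local and authenticated are invented for the example.

```python
def run_acl(acl, ctx):
    """Evaluate a list of (command, conditions) lines.

    Returns ("accept" or "deny", list of warnings collected on the way).
    """
    warnings = []
    for command, conditions in acl:
        # Conditions on one line are ANDed; an empty list counts as true.
        met = all(cond(ctx) for cond in conditions)
        if command == "accept" and met:
            return "accept", warnings
        if command == "deny" and met:
            return "deny", warnings
        if command == "pass" and not met:
            return "deny", warnings
        if command == "warn" and met:
            warnings.append("warning")
        # In every other case the line passes and the next one is inspected.
    return "deny", warnings  # end of the list reached: access is denied


is_local = lambda ctx: ctx["domain"] == "example.test"
authenticated = lambda ctx: ctx["authenticated"]

acl = [
    ("warn", [lambda ctx: ctx["blacklisted"]]),
    ("accept", [is_local]),
    ("accept", [authenticated]),
]
```

With this list, a blacklisted host sending to the local domain is accepted with one warning, while an unauthenticated host sending elsewhere falls off the end and is denied.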


The other kind of line that may appear in an ACL list consists of one of the
words deny_message or warn_message followed by arbitrary text. The first
sets up a message to be used in the event that a subsequent command denies
access. For example:

    deny_message  Your host has spammed us too much
    deny          host = 10.9.8.7


Each deny_message text supersedes the previous one, and is expanded before
use. The default text is `administrative prohibition'. This feature replaces
the prohibition_message option and the $prohibition_reason variable in the
existing Exim.
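
How the deny_message text might be tracked while a list is processed can be sketched like this. It is a simplified Python model: only accept, deny and deny_message lines are handled, string expansion is not modelled, and the function name deny_text and the host addresses are assumptions.

```python
DEFAULT_DENY_MESSAGE = "administrative prohibition"


def deny_text(acl, ctx):
    """Return None if access is granted, else the text used for the denial."""
    message = DEFAULT_DENY_MESSAGE
    for keyword, arg in acl:
        if keyword == "deny_message":
            message = arg      # supersedes any earlier text
        elif keyword == "deny" and arg(ctx):
            return message
        elif keyword == "accept" and arg(ctx):
            return None
    return message             # implicit deny at the end of the list


acl = [
    ("deny", lambda ctx: ctx["host"] == "192.168.4.5"),
    ("deny_message", "Your host has spammed us too much"),
    ("deny", lambda ctx: ctx["host"] == "10.9.8.7"),
    ("accept", lambda ctx: ctx["local"]),
]
```

The first deny is reached before any deny_message line, so it uses the default text; the second uses the text set just before it.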

The warn_message line sets up the text to be added to the incoming message
as a new header line if the warning triggers. For example:

    warn_message  X-RBL-Warning: Sending host is DUL-listed
    warn          blacklisted = dul.maps.vix.com


Again, there is just one message. I'm not sure if there should be a default
for this, because warnings can now be given for all kinds of reasons. When
converting configuration files automatically, I'll try to insert a setting
that has the same effect as before.

The conditions in an ACL all start with a keyword, and some are followed by
data items. The following are proposed:

   * recipient = address list to check the recipient address when receiving
     a message.


   * sender = address list to check the sender address when receiving a
     message.


   * host = host list to check the identity of a remote client host.

   * blacklisted = rbl parameters to do an RBL check on a remote host.

   * host_authenticated to insist on authentication.

   * tls to insist on a TLS connection.

   * verify_recipient to verify the recipient.

   * verify_sender to verify the sender.

   * batch is true during batch SMTP input.

   * acl = list of ACL names; this is a `subroutine' facility for ACLs.

There is probably also a case for introducing a general condition, itself
called condition,
that can be used to build custom tests (`only accept from this host during
offpeak hours'). This option would specify a string that is expanded; it
could test any of the available variables, just like condition in the
routers. This could be used to check the details of TLS authentication, for
example.

If no conditions are given for a command, it acts as if the condition is
true. Thus, for example,

    deny


denies access in all circumstances. This list of conditions covers the
existing facilities, I think. It will be easy enough to add new conditions
later if they are wanted.

Here's how the default configuration might use this scheme. The main part of
the configuration is:

    domain_list local_domains = @
    accept_recipient = local_acl


The ACL section of the configuration is:

    local_acl:
      accept recipient = +local_domains : verify_recipient : verify_sender


To help people set up for relaying, the supplied file should probably also
contain some empty, possibly commented out, settings for the standard
`incoming' and `outgoing' relay requirements.

I'm now going to work through some additional examples to give a feel for
how ACLs for controlling incoming recipients will look. I'm going to use a
single ACL list to do the entire job. Suppose we want to add incoming
relaying to a specific set of domains, and not to verify relayed addresses
(but still verify the sender). The relay domains are a.b.c and the contents
of /etc/relay/domains (a dbm file).

    accept recipient = +local_domains : verify_recipient : verify_sender
    accept recipient = a.b.c :: dbm;/etc/relay/domains : verify_sender


Notice the need to double the colon in the address list. Alternatively, the
separator could be redefined. This particular ACL could be written in other
ways. For example:

    pass   verify_sender
    accept recipient = a.b.c :: dbm;/etc/relay/domains
    deny   recipient = ! +local_domains
    accept verify_recipient
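
The colon doubling noted above can be illustrated with a small Python sketch. It assumes that a single colon separates the conditions on an ACL line and that a doubled colon is handed on as one literal colon to the list inside a condition; the function name split_conditions is invented for the example.

```python
def split_conditions(text):
    """Split an ACL condition string on single colons.

    A doubled colon is kept as one literal colon inside the current item.
    """
    items, current, i = [], "", 0
    while i < len(text):
        if text[i] == ":":
            if i + 1 < len(text) and text[i + 1] == ":":
                current += ":"   # "::" becomes one literal colon
                i += 2
                continue
            items.append(current.strip())
            current = ""
        else:
            current += text[i]
        i += 1
    items.append(current.strip())
    return items


line = "recipient = a.b.c :: dbm;/etc/relay/domains : verify_sender"
conditions = split_conditions(line)
# The inner address list is then split on the remaining single colon.
addresses = [a.strip() for a in conditions[0].split("=", 1)[1].split(":")]
```

Here the line yields two conditions, and the recipient condition carries a two-item address list.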


The advantage of listing the conditions explicitly is that it enables you to
control the order in which they operate and how they interact with each
other. This is not possible in the current Exim, where these things are
hard-wired into the code. The new scheme also allows you to have different
error messages for each case. For example:

    deny_message  Sender must verify
    pass          verify_sender
    accept        recipient = a.b.c :: dbm;/etc/relay/domains
    deny_message  Relaying prohibited from $host
    deny          recipient = ! +local_domains
    accept        verify_recipient
    deny_message  Local address doesn't verify


The last deny_message applies to the implicit deny at the end of the list.

There need to be replacements for the existing local_domains_include_mx and
relay_domains_include_local_mx options. This can be handled by adding new
features to domain lists meaning `any domain whose lowest MX points to the
local host' and `any domain with an MX pointing to the local host',
respectively. We already have @ meaning `the name of the local host', so
perhaps @mx_top and @mx_any might be a suitable syntax for this.
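
The intended difference between the two items can be sketched in Python, given MX records that have already been looked up. This assumes that `lowest MX' means the record with the numerically lowest preference value; the function names and host names are made up for the example.

```python
def mx_any(mx_records, local_host):
    """True if any MX record for the domain points to the local host."""
    return any(host == local_host for _, host in mx_records)


def mx_top(mx_records, local_host):
    """True if a record with the lowest preference value points to the local host."""
    if not mx_records:
        return False
    lowest = min(pref for pref, _ in mx_records)
    return any(pref == lowest and host == local_host
               for pref, host in mx_records)


# (preference, host) pairs for some domain; invented data.
records = [(10, "mail.example.test"), (20, "backup.example.test")]
```

On backup.example.test the domain would match @mx_any but not @mx_top; on mail.example.test it would match both.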

To add a block for certain hosts to the basic example:

    deny   host = 10.9.8.7 : 192.168.4.5
    accept recipient = +local_domains : verify_recipient : verify_sender
    accept recipient = a.b.c :: dbm;/etc/relay/domains : verify_sender


To add a block for certain senders, but make postmaster@??? an
exception to all blocks:

    accept recipient = postmaster@???
    deny   host = 10.9.8.7 : 192.168.4.5
    deny   sender = bad@???
    accept recipient = +local_domains : verify_recipient : verify_sender
    accept recipient = a.b.c :: dbm;/etc/relay/domains : verify_sender


Let's add in relaying from any authenticated host, allow outgoing relaying
from a local LAN, and reject from an RBL list, to give something approaching
what a server might actually have:

    accept recipient = postmaster@???
    deny   host = 10.9.8.7 : 192.168.4.5
    deny   sender = bad@???
    deny   blacklisted = dul.maps.vix.com
    pass   verify_sender
    accept recipient = +local_domains : verify_recipient
    accept recipient = a.b.c :: dbm;/etc/relay/domains
    accept host_authenticated
    accept host = 192.168.23.0/24


This example shows up a gotcha that it is easy to fall into. The ACL is not
correct, because it allows an authenticated host to send to a local domain,
without the recipient having verified. To fix this, it is necessary to add
this extra line before the final two lines:

    deny   recipient = +local_domains
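
The gotcha can be reproduced with a reduced Python model of the list. Only the last five lines of the example are modelled, each line has a single combined condition, and the dictionary keys are assumptions chosen for the illustration.

```python
def run(acl, ctx):
    """Return "accept" or "deny" for a list of (command, condition) lines."""
    for command, cond in acl:
        met = cond(ctx)
        if command == "accept" and met:
            return "accept"
        if command == "deny" and met:
            return "deny"
        if command == "pass" and not met:
            return "deny"
    return "deny"  # implicit deny at the end of the list


local = lambda c: c["local_domain"]

acl = [
    ("pass",   lambda c: c["sender_ok"]),
    ("accept", lambda c: local(c) and c["recipient_ok"]),
    ("accept", lambda c: c["relay_domain"]),
    ("accept", lambda c: c["authenticated"]),
    ("accept", lambda c: c["on_lan"]),
]

# The fix: deny local-domain recipients before the final two lines.
fixed = acl[:3] + [("deny", local)] + acl[3:]

# An authenticated host sending to a local address that does not verify.
case = dict(sender_ok=True, local_domain=True, recipient_ok=False,
            relay_domain=False, authenticated=True, on_lan=False)
```

In this model the original list accepts the unverified local recipient and the fixed list denies it, while an authenticated host relaying to a non-local domain is still accepted.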


The warn command can provide the existing RBL warning facilities, but so far
there is nothing that can do what sender_try_verify does. In the existing
Exim, this does a sender verify and accepts or denies if the verification
concludes. However, if verification cannot be completed (for example,
there's a DNS timeout), it accepts instead of denying.

Before inventing stuff to re-implement this, I feel I should ask: Does
anybody actually use sender_try_verify? If you do, do you feel that it is a
sufficiently important feature to retain? If it is wanted, then I think we
need another condition called verify_try_sender, which has the required
behaviour.
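
The difference between the two conditions comes down to how an incomplete verification is treated, which a short Python sketch can show. The Outcome type and the function bodies are assumptions; the sketch only models whether each condition would be true.

```python
from enum import Enum


class Outcome(Enum):
    VERIFIED = "verified"
    FAILED = "failed"
    INCOMPLETE = "incomplete"   # for example, a DNS timeout


def verify_sender(outcome):
    """Strict condition: true only for a completed, successful verification."""
    return outcome is Outcome.VERIFIED


def verify_try_sender(outcome):
    """Lenient condition: false only when verification completed and failed."""
    return outcome is not Outcome.FAILED
```

The two conditions differ only for Outcome.INCOMPLETE, where verify_try_sender is true and verify_sender is not.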

This is how the wish items above could be handled:

   * A smarthost might want to reject messages from any of its clients that
     are blacklisted, but just put warnings into incoming messages from
     other hosts that are blacklisted.


         deny   host = 192.168.47.0/24 : blacklisted = rbl.maps.vix.com
         warn   blacklisted = rbl.maps.vix.com
         accept recipient = +local_domains : +relay_domains


   * A smarthost might want to ban any blacklisted client hosts from
     relaying, while still accepting mail for local delivery from them.


         accept  recipient = +local_domains
         deny    host = ! 192.168.4.0/24
         deny    blacklisted = rbl.maps.vix.com
         accept


   * Do different RBL processing for different recipients.

         pass    recipient = +local_domains
         deny    recipient = user1@* : blacklisted = rbl.maps.vix.com
         deny    recipient = user2@* : blacklisted = dul.maps.vix.com
         accept


The recipients could of course be checked using a lookup. If you have a few
`standard' ACLs that different sets of users could request, you could use
something like this:

    pass    recipient = +local_domains
    deny    recipient = dbm;/etc/type1-users : acl = type1_acl
    deny    recipient = dbm;/etc/type2-users : acl = type2_acl


   * To say `if authenticated, ignore any RBL listing'.

         accept  host_authenticated
         deny    blacklisted = rbl.maps.vix.com
         accept


There is a problem with the existing sender_verify_fixup option. Under the
new scheme, a bad sender always causes rejection at RCPT time, before the
message's header has been read. Thus, it is not possible to check for a good
sender in the header lines before rejecting. (In the current Exim, a bad
sender doesn't - the first time - cause a rejection until the message's data
has been read.) There are two possibilities:

   * Just abolish sender_verify_fixup. How many sites actually make serious
     use of this facility?


   * If sender_verify_fixup is set, don't reject RCPT just because of a
     failing sender verification; instead, try to do the fixup as now, after
     the data has been read, and if it fails, reject after the end of the
     data. This, however, rejects the entire message, and so would not allow
     for exceptions (e.g. always accept mail for postmaster). For this
     reason, I favour abolishing sender_verify_fixup.


Opinion during discussion also seems to be in favour of abolition.

Other uses of access control lists

I mentioned smtp_etrn_hosts as a candidate for replacement by an ACL above.
The new option would be called smtp_etrn_acl, and would be a list of ACLs,
one of which must accept. Other similar options are:

   * smtp_expn_hosts becomes smtp_expn_acl.

   * smtp_vrfy (currently a boolean) is replaced by smtp_vrfy_acl, to
     specify which hosts may use VRFY.


Question: Are these generalizations actually going to be useful in practice?
I am tempted not to make these changes (which could be done later by a
compatible addition of the new options) unless they are genuinely wanted.

Different kinds of input

Another suggestion that was made was to differentiate between different
kinds of message input, and possibly apply different rules to them. The
different input sources are:

   * SMTP from a remote host on address xxx port yyy.

   * Batch SMTP.

   * SMTP on standard input from another process.

   * Message on standard input.

The suggestion was that one might have `injectors' or `acceptors' that
applied to the different cases. Each one could specify a different ACL for
accepting recipients (not sure about the last one), and could also specify
other parameters concerning incoming messages, such as whether unqualified
addresses are acceptable, and rewriting rules.

My feeling on this is that it is getting too complicated to graft into the
existing design, so I do not propose to take this further.

Ident checks in host lists

Any item in a host list may start with ident@ to invoke an RFC 1413 ident
check. This was an early idea for use among a cluster of hosts to allow one
MTA to test for a message coming from the MTA on another host, as opposed to
a user program. It could also be used in settings like

    smtp_etrn_hosts = exim@192.168.5.4/24


However, I rather suspect that in practice, this feature has never been
used. I propose to remove it, in order to simplify host lists. If such
checks are wanted, they can be added as new conditions to ACL lists.

Use of dynamically loaded libraries

People want Exim to use dynamically loaded modules for a variety of reasons.
When I started to create Exim, I never expected anything other than source
distribution; the RPMs and inclusions in OS distributions caught me by
surprise. I know very little about the mechanics of dynamic loading, but I'm
aware that not all operating systems support it. I'm also aware that not all
people support it!

My current position is that there is enough to do for Exim 4 without this
added complication, and that I should concentrate on the other major
changes. Dynamic loading will remain on the Wish List, and be considered
later.

Message arrival rate checking

There was a request for the ability to reject messages if more than so many
arrived in one second from any one host. I don't think this would be
difficult to implement, but it is a dangerous facility. If your host comes
back up after being down for a while, a host that normally sends you lots of
mail may well have a backlog that it sends as quickly as it can. For this
reason, I am not very keen on this.
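
To make the danger concrete, here is a minimal Python sketch of such a check, using a one-second sliding window per host. The class name, the limit of three messages and the host address are assumptions; this is one possible way to count arrivals, not a proposed design.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most max_per_second messages per host in any one-second window."""

    def __init__(self, max_per_second):
        self.max = max_per_second
        self.arrivals = defaultdict(deque)   # host -> recent arrival times

    def allow(self, host, now):
        window = self.arrivals[host]
        # Forget arrivals that are a second or more old.
        while window and now - window[0] >= 1.0:
            window.popleft()
        if len(window) >= self.max:
            return False
        window.append(now)
        return True


limiter = RateLimiter(max_per_second=3)
# A legitimate host flushing its backlog sends five messages in half a second.
backlog = [limiter.allow("192.168.4.5", t) for t in (0.0, 0.1, 0.2, 0.3, 0.4)]
```

The last two messages of the backlog are refused even though the sending host is behaving normally after an outage.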

Full mailboxes

Currently, Exim doesn't discover that a local mailbox is full until it tries
to deliver a message to it. There have been requests for a way of
discovering this information earlier, so that a rejection can be given to a
RCPT command. There needs to be a choice between a 4xx and a 5xx rejection.

Because of the way Exim works, there are a lot of problems with this:

   * It could only work for Exim quotas because there's no standard way of
     checking system quotas, as far as I know.


   * When quota_is_inclusive is set on the transport, the check could only
     be done if the client has sent the size of the message on the MAIL
     command.


   * Exim quotas are a per transport thing. The only way to check would be
     to run the start of an appendfile transport, to do the check.
     Unfortunately, when Exim is receiving a message, it is running as exim,
     not as root, so it couldn't run the transport under the correct
     uid/gid. That means it couldn't find the size of the existing mailbox.


The only way I can think of for doing this would be to provide a command
line option that gives a yes/no answer to the question `Is this local part
over quota?', and to get the receiving Exim to run another Exim in another
process (so that it gets back its privilege) to do the test. This seems to
me to be far too contorted and over-complicated, so I do not propose to do
anything at this time.

Local code for message scanning

There has been some discussion on the list about ways of incorporating local
code into Exim to handle message scanning. Marc Haber listed some basic
principles:

   * The scan should be done only once per message.

   * A clean message should be passed through the MTA only once.

   * The scan should be done by a child process of Exim to solve locking
     problems.


   * There should be documented interfaces in Exim for obtaining all the
     information needed, and taking the necessary actions.


   * Possible actions are: bounce the message, send a message to sender
     and/or recipients, throw away the message, deliver the message to a
     specified address, or modify the message before delivering it.


Some of this can already be done by running embedded Perl from a system
filter. However, although it does not in fact use a child process, this is
expensive in terms of overhead.

The most efficient way to do this external message checking would be to
define an API to a C function, with a specific set of functions it could
call back to take certain actions. Exim would come with a `do nothing'
version of the function which would be used by default. This approach gives
the possibility of writing C code for speed, or just using the C function to
call Perl, or even to fork and call an entirely separate program.

I can see two possible ways that something like this could be done. One is
much simpler than the other.

A simple extension to the system filter

Before running a system filter, the external C function is called, and the
result is made available in a variable that can be tested in the filter. The
external function is passed the sender and recipient addresses, and an open
file descriptor to the body. It can call Exim's string expander to obtain
the value of any Exim variable or header line.

This approach just adds more conditional functionality to the filter. The
actions that can be taken as a result do not change. In particular, it is
not possible to modify the message directly (though of course it can be
delivered to some program that modifies and then re-injects it).

A much more complicated filtering scheme

Once a message has been received, setting up a scheme for modifying it is
not straightforward. I am not convinced about the need to modify messages
`inside' an MTA, but I have thought about the problem.

One possibility could be an input filter. This would pass the incoming
message, byte by byte, through an external function. The incoming message
that Exim receives is what the external function passes on to it. That would
allow early modification, before the message even gets to Exim's spool
files. How could such a scheme work? Here is a first attempt at an API:

   * At the start of receiving a message, emc_newmessage() is called. This
     can perform any initialization necessary.


   * For SMTP input, the sender and recipient addresses are passed
     individually to emc_sender() and emc_recipient(). These functions can
     return permanent or temporary rejections.


   * When the DATA command is received, Exim calls emc_data(), to obtain a
     return code.


   * Exim reads each byte of the message by calling emc_getc(), which gets
     its data from the existing input source (a function to call would be
     passed).


   * The code in emc_getc() must interpret the message itself. It can, for
     example, buffer up all the header lines before passing anything on to
     Exim. It can also modify the message as it passes.


   * At the end of the message, when Exim has written it to its spool, but
     has not yet acknowledged it to the sender, Exim calls emc_endmessage().
     This can accept or reject the message (temporarily or permanently). A
     callback function could be provided so that it could add or remove
     recipients at this time.
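
The byte-by-byte step of this API can be modelled in Python to show how much work falls to the external function. The sketch borrows the name emc_getc from the list above, but its shape (a generator over byte values), the added X-Scanned header and the use of bare LF line endings are all assumptions made to keep the example short.

```python
def emc_getc(source, extra_header=b"X-Scanned: yes\n"):
    """Yield the message one byte at a time, as Exim would read it.

    All header lines are buffered first; one extra header line is added
    before the blank line that ends the header, then the body is passed
    through unchanged.
    """
    source = iter(source)
    header = bytearray()
    for byte in source:
        header.append(byte)
        if header.endswith(b"\n\n"):
            header[-1:-1] = extra_header   # insert before the blank line
            break
    yield from header   # the (possibly modified) header
    yield from source   # the rest of the message


message = b"From: a@example.test\nSubject: hi\n\nbody text\n"
received = bytes(emc_getc(message))
```

Even this toy version has to find the end of the header for itself, which is the duplication of effort discussed next.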


This does seem like quite a lot of complication. Not only for Exim, but also
for the external filter, which would have to duplicate quite a lot of what
Exim does (e.g. pick out header lines) if it wanted to do serious message
analysis.

My own preference is to go for the simpler scheme, and in the discussion so
far people have agreed with me.

Deliveries from a system filter

Deliveries from a system filter that are not `unseen' override the normal
deliveries for the message. However, such deliveries are processed like
alias and forwarding addresses. In particular, they are not remembered if
they fail to deliver at the first attempt. This means that at a subsequent
attempt, the original recipients are reinstated.

This doesn't make any difference if the system filter sets up its deliveries
unconditionally, because they get set up again the next time, and again
override the original recipients. However, if the system filter tests
first_delivery, the behaviour can vary. For example, this filter does spam
checks for the first delivery attempt, but not for any subsequent ones (on
the grounds that it's done the check already):

    if first_delivery then
      if <some spam checks> then
        deliver spam.commissar@???
        finish
      endif
    endif


However, if the delivery to spam.commissar@??? has a temporary
failure, it won't be retried because first_delivery will be false at the
next delivery attempt.
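
The effect can be traced with a small Python model of the filter above. The function names are invented, and the addresses are placeholders standing in for the ones shown in the filter.

```python
def system_filter(first_delivery, looks_like_spam):
    """Mirror of the filter above: override recipients, or None for no override."""
    if first_delivery and looks_like_spam:
        return ["spam.commissar@example.test"]
    return None


def recipients_for_attempt(original, attempt, looks_like_spam):
    """Recipients used at a given delivery attempt (1 is the first)."""
    override = system_filter(first_delivery=(attempt == 1),
                             looks_like_spam=looks_like_spam)
    return override if override is not None else original


original = ["user@example.test"]
first = recipients_for_attempt(original, 1, looks_like_spam=True)
second = recipients_for_attempt(original, 2, looks_like_spam=True)
```

At the second attempt the filter sets up no delivery, so the original recipient is back in force.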

The documentation isn't clear on this, and obviously it can be improved, but
should anything else be done? When aliasing or forwarding, the one_time
option can be set, to cause the new addresses to replace the original in the
`top-level' list of recipients. Something similar could be done here.
Suppose, for example, the filter contained

    one_time deliver spam.commissar@???


This would cause the new address to replace the original recipients, and so
be retried if it failed on the first attempt. I think one_time and unseen
should be mutually-exclusive settings.

Exim could, of course, automatically take this one-time action for all
non-unseen deliveries in a system filter. However, making it optional gives
more flexibility; there may be situations where it isn't wanted. In
discussion, it was suggested that one_time should be the default. Note,
however, that it cannot apply to pipes, files, or autoreplies because they
cannot be remembered as `top-level' addresses. For this reason I'd rather
not make it the default, for consistency.

Pipes run at filter time

People want to make use of the result of pipes at filter time. This could
probably be implemented fairly straightforwardly by a new condition, but the
problem is that people would want to test the result several times, and that
would really require there to be a variable to store it in. I avoided
variables in the filter language; I wanted to keep it very simple, both to
explain and to implement. I reckoned people could always use procmail if
they wanted more. So I'm not keen to do this, but I'll continue to think
about it.

Configuration file

There have in the past been several requests for `include' facilities in the
configuration file. I've resisted on the grounds that this means reading
more files each time an Exim process starts up. However, people tell me that
file caching is very efficient, and anyway, nobody is forced to use the
facility. So I'm prepared to think about this one. A problem is what syntax
to use. For maximum flexibility, an `include' line should be allowed
anywhere, so the syntax must, if possible, not conflict with anything that
might be part of the configuration. The only really safe thing is to steal a
`comment' line. Would it be safe to interpret lines of the form

    #include /some/file/name


as `include' lines? Maybe two ## at the start would be safer? Any better
ideas? The ownership and mode of an included file will be checked in exactly
the same way as the main configuration file.
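
A rough Python sketch of how such lines might be expanded while reading the configuration. It assumes the single # form of the marker, leaves out the ownership and mode checks and any protection against include loops, and uses a dictionary in place of real files; the function name expand_includes is invented.

```python
def expand_includes(lines, read_file, marker="#include "):
    """Return the lines with each include line replaced by the named file's lines.

    read_file is any callable that maps a path to a list of lines.
    """
    out = []
    for line in lines:
        if line.startswith(marker):
            path = line[len(marker):].strip()
            out.extend(expand_includes(read_file(path), read_file, marker))
        else:
            out.append(line)   # ordinary comments and settings are kept as-is
    return out


files = {
    "/some/file/name": ["domain_list local_domains = @", "#include /other"],
    "/other": ["accept_recipient = local_acl"],
}
main = ["# main configuration", "#include /some/file/name", "end"]
expanded = expand_includes(main, files.__getitem__)
```

An ordinary comment that happened to begin with the marker would be taken as an include line, which is the risk raised above.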

There has been a request that the `end' lines be extended so you can say,
for example,

    end transports


The idea is that this gives you a warm feeling that you have got the
sections correct. I'm not very keen, on the grounds that it doesn't add very
much, and you can always write

    end
    #End transports


if you want to. Views?

There was another request for the abolition of the need to use backslash for
continuing configuration lines. The problem is, what syntax could replace
it?

Changes to default runtime configuration

The runtime configuration will, of course, have to be modified to cope with
all the syntactic changes that are made. In terms of functionality, I
propose to turn on sender and receiver verification by default.

Proposed abolitions

There are a number of features that I'd like to abolish, unless lots of
people object. Some have been around for a long time and I'm not sure how
useful they are; others have been superseded in various ways.

   * never_users is a paranoid trigger guard to avoid running local
     deliveries as root. I'm not sure that it actually achieves much, and it
     is just an added complication.


Discussion of this proposal showed that people like to have never_users, so
I will leave it.

   * nobody_user and nobody_group are used only in the queryprogram router
     if no values are given. This is real minority stuff, and I think they
     should go.


   * The {key:subkey} facility for single-key lookups has the same effect as
     using an extract item on the looked up value. The colon used to delimit
     the key is a theoretical problem, because colons can appear in keys that
     are used for lookups other than lsearch. For these reasons, I'd like to
     abolish this feature.


   * freeze_missing_include was added to disable a bit of early paranoia. I
     now think it is reasonable to abolish it, and always to defer delivery
     if an include file is missing. (See further discussion of freezing
     below.)


   * I think that nowadays it is not a good idea to encourage the use of
     `host literal' addresses such as user@???. I therefore suggest
     abolishing local_domains_include_host_literals. If someone wants to
     support host literals, they can always list the literal explicitly in
     local_domains. However, in discussion it was pointed out that this
     makes it difficult to set up configurations that work on more than one
     host. A compromise is to invent some magic syntax to match @, which
     represents the host name. I suggest @ipliteral.


   * We really only need one option instead of both dns_check_names and
     dns_check_names_pattern. An unset pattern could be used for `no check'.


   * The $precedence variable is very old, and pre-dates the ability to look
     at the contents of any header. I propose to abolish it, now that the
     same value is available from $h_precedence:.


   * The errors_address option has `decayed' over time. It seems that it is
     now used only for freeze_tell_mailmaster messages and for bounces
     caused by -Mg applied to a bounce message. In the latter case, I think
     we can just discard the bounce, and treat -Mg as -Mrm. For freezing, I
     propose a new option called freeze_tell, which contains a list of
     addresses to be mailed when a message freezes, but as now, this won't
     include frozen failed bounces.


   * The errors_copy option was an early facility for taking copies of
     locally generated bounce messages. I do not know how much it is used,
     but the feature can now be implemented using additional routers with
     unseen set, so I originally proposed to abolish it. However, several
     people objected, and pointed out that it was a useful simple feature
     for beginners. So it's reprieved.


   * local_domains_include_host can be abolished, because you can now
     include @ in local_domains to refer to the local host.


   * Having proposed the abolition of the director option
     directory2_transport above, it seems sensible to get rid of
     message_filter_directory2_transport as well. The
     message_filter_file_transport option is not currently expanded, but it
     could easily be, thereby retaining the ability to switch transports
     according to the form of the path name.


   * message_size_limit_count_recipients was a `bright idea' that doesn't
     really seem to be very useful, so I propose to abolish it.


   * The X rewriting flag is very special-purpose. Putting it in the general
     rewriting rules was a fudge. Now that rewriting rules can be handled
     from option settings with the addition of rewrite_headers to
     transports, it would be better to put these special rewriting rules in
     a new option of their own instead. It could be called
     sender_verify_rewrite, meaning `apply these rules and check again when
     verifying a sender'.


   * If the radical reform of policy checking that was described above does
     not go ahead, leaving sender_verify, receiver_verify,
     sender_verify_hosts and receiver_verify_hosts still in existence, I
     propose to abolish the boolean options sender_verify and
     receiver_verify. The effect can be achieved by setting the host lists
     appropriately. However, it now looks as if the ACL proposal will
     happen.


   * collapse_source_routes has done nothing for some time, so should go.

   * A number of options using the word service are synonyms for options
     using port (e.g. daemon_smtp_service). This usage came from Smail (and
     I guess refers to /etc/services), but everybody talks about `ports'
     nowadays, so I think we can lose the old names.


   * timestamps_utc can be abolished - the equivalent effect can be obtained
     by setting


         timezone = utc


   * The appendfile and pipe transports have an obsolete from_hack option
     which was superseded by check_string and escape_string some time ago.


Autoconf

Somebody once tried to autoconf Exim, but found it too big a job. I now have
some experience with using autoconf for PCRE, and I think maybe some use
could be made of it. I don't, however, believe that all Exim build-time
configuration should be done that way. The reason is that, unlike something
like PCRE, there is quite a lot of information that is `user choice'. Giving
it all as options to a configure command does not seem the best way of doing
things.

Whenever I build something that needs more than a couple of obvious options
to configure, I always save them in a file anyway, so I know what I did for
next time. Therefore, I think it is sensible to retain the current Local
file structure for all the user choice configuration.

However, it might be helpful to use autoconf to dig out various bits of
information about the operating system. At present, the OS/Makefile-* files
have hard-wired settings, and maybe this information could be figured out by
running autoconf, which would save having to keep maintaining these files.

I would arrange things so that configure is run automatically the first time
that make is run, but it would be possible to run it manually first, to
override defaults. (For example, if you have both cc and gcc installed on
your system, as I do, you need to be able to specify which to use.) I will
need to do some experiments to see exactly how this would work.

Inter-process communication

When I started, I tried to use only the most basic Unix functions, in order
to make Exim as portable as possible. However, operating systems keep moving
on, and perhaps it is now possible to be a bit more adventurous. I need to
do some investigation to see what features are available on the various
operating systems that Exim supports, but I suspect that shared memory,
mmap(), and semaphores are widespread and could be useful.

Semaphores could be used to implement serialize_hosts and
smtp_etrn_serialize without having to use a hints file. Not only is this
likely to be more efficient, it avoids the problem of data being left around
after a system reboot. Semaphores might also be useful for implementing a
maximum number of outgoing SMTP calls over all Exim processes.
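
As an illustration of the last point only (this is not Exim code), the
sketch below uses a counting semaphore to cap the number of simultaneous
outgoing calls. It is written in Python, and threads stand in for separate
Exim processes, which is an assumption made to keep the example short; a
real implementation would need a semaphore shared between processes. The
names MAX_OUTGOING, deliver and run_deliveries are invented for the example.

```python
import threading
import time

MAX_OUTGOING = 3          # hypothetical cap on simultaneous SMTP calls
slots = threading.BoundedSemaphore(MAX_OUTGOING)

lock = threading.Lock()   # protects the two counters below
active = 0                # calls in progress right now
peak = 0                  # highest value 'active' has reached


def deliver(host):
    """Pretend to make one outgoing SMTP call to 'host'."""
    global active, peak
    with slots:                   # blocks while MAX_OUTGOING calls are active
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)          # stand-in for the SMTP conversation
        with lock:
            active -= 1


def run_deliveries(hosts):
    """Start one worker per host, wait for them all, return the peak."""
    workers = [threading.Thread(target=deliver, args=(h,)) for h in hosts]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return peak
```

However many workers are started, the peak number of concurrent calls
cannot exceed MAX_OUTGOING, because each worker must hold one of the
semaphore's slots for the duration of its call.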

mmap() could be used for faster access to files of configuration data,
including lsearch files. I plan to do some experiments on this.
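
To make the idea concrete, here is a minimal Python sketch (not Exim code)
of a linear search through a memory-mapped file. It assumes a simplified
file format of one `key: value' pair per line with `#' comment lines, and
it ignores continuation lines and quoted keys. The function name
lsearch_mmap is invented for the example, and the mapping will fail on an
empty file.

```python
import mmap
import os
import tempfile


def lsearch_mmap(path, key):
    """Linear search of a 'key: value' file through a read-only mapping.

    Returns the value as a string, or None if the key is not present.
    """
    wanted = key.encode()
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for line in iter(m.readline, b""):
                line = line.strip()
                if not line or line.startswith(b"#"):
                    continue          # skip blank lines and comments
                name, _, value = line.partition(b":")
                if name.strip() == wanted:
                    return value.strip().decode()
    return None


# Usage: build a small throwaway file and look up two keys.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"# aliases\nroot: postmaster\nph10: philip\n")
os.close(fd)
found = lsearch_mmap(demo_path, "ph10")
missing = lsearch_mmap(demo_path, "nobody")
os.unlink(demo_path)
print(found, missing)
```

Whether this is actually faster than ordinary buffered reads is exactly
what the experiments would have to establish.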

Message freezing

I was very paranoid at the start, and specified freezing whenever things
were slightly awry. Over time, a number of cases that used to freeze have
been changed to defer instead. (See also the comment about
freeze_missing_include above.) Freezing applies to the whole message, not
just one recipient. So if one recipient has junk in a .forward file that
provokes freezing, it might hold up delivery for other recipients that don't
get delivered at the first attempt.

I propose, therefore, to remove the freezing from cases such as syntax
errors for :include: items and other parsing errors. In these cases, trying
again later won't do any harm. Automatic freezing will remain for problems
that are more serious than just mistakes made by humans in configuration
files (for example, sub-process crashes).

Panics

As in the case of freezing, there are a number of places where Exim panics
because I was being paranoid. I propose to try to reduce the number of panic
crashes to an absolute minimum. This won't affect the panic log, which will
still be used for incidents that are considered serious enough to be brought
to the admin's attention.

Log level and log options

The simple `log level' approach is too crude. As a result, a number of
separate `log this' and `log that' options have been created. It would be
tidier to abolish all this apparatus, and instead go for a bit-pattern to
control optional logging. This would give a lot more flexibility. You could
say something like

    log_selector = 10246


where the value is an octal number. The manual would list the bits and what
they mean. Names could be defined for common bits and sets of bits, with
addition and subtraction facilities, so settings like this could be used:

    log_selector = +incoming_port+rewrites
    log_selector = +all-subject
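
The two forms of setting could be handled by one small parser. The Python
sketch below is an illustration only, not Exim code: the selector names
and their bit values are invented, since the real list would be defined in
the manual, and the function name parse_selector is hypothetical.

```python
import re

# Hypothetical names and bit values, invented for this sketch only.
LOG_BITS = {
    "incoming_port": 0o01,
    "rewrites": 0o02,
    "subject": 0o04,
    "retry_defer": 0o10,
}
ALL_BITS = 0
for value in LOG_BITS.values():
    ALL_BITS |= value


def parse_selector(text, start=0):
    """Turn an octal number or a +name-name string into a bit pattern."""
    text = text.strip()
    if re.fullmatch(r"[0-7]+", text):
        return int(text, 8)                 # plain octal form
    if not re.fullmatch(r"(?:[+-]\w+)+", text):
        raise ValueError("malformed selector: " + text)
    bits = start
    for sign, name in re.findall(r"([+-])(\w+)", text):
        if name == "all":
            mask = ALL_BITS
        elif name in LOG_BITS:
            mask = LOG_BITS[name]
        else:
            raise ValueError("unknown selector name: " + name)
        if sign == "+":
            bits |= mask                    # switch these bits on
        else:
            bits &= ~mask                   # switch these bits off
    return bits
```

Items are applied from left to right, so `+all-subject' first turns every
known bit on and then clears the one for subject.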


Debugging

The simple `debug level' approach is also far too crude. It would be better
to use a bit-pattern to select debugging of various types. For example, you
might want to turn on DNS resolver debugging but not anything else. At
present you have to turn on everything else to get this. As for logging,
names could be provided for commonly required bits or sets of bits. I
propose to go through the code and classify the debugging output more
carefully, e.g. debugging routers, debugging transports, debugging filters,
and so on, and make them independently switchable. You would then be able to
run commands like this:

    exim -d3040 ....
    exim -d+transports


to get very specific debugging information. The -dm option for debugging
memory could be abolished, with its function assigned to one of the new
debugging bits.

Debugging when running under inetd is not easy because the only way is to
recompile Exim to specify where to write the debugging output. This ancient
mechanism should be replaced by a runtime option to specify an output file.

Hints databases

There was a request for the dns_again_means_nonexist option not to be
instantaneous, but to operate only after the DNS has been giving `try again'
for some time. This requires inter-process communication, and could perhaps
be implemented using shared memory. Alternatively, a new hints database
could be created, but that seems a lot of mechanism for just this very small
item.

It occurs to me that perhaps what I should implement is a `general' hints
database for keeping odd little bits of information like this all in one DBM
file, instead of having a lot of different databases. It probably makes
sense to keep the main retry and waiting-for-host databases separate, but
everything else could be amalgamated.
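
As a rough illustration (not Exim code), the Python sketch below shows how
a delayed dns_again_means_nonexist might use such a general store. The
store is assumed to be any key/value mapping; a plain dictionary is used
here, and a DBM file is the obvious substitute. The key prefix
`dns_again:', the one-hour grace period and the function name are all
invented for the example.

```python
import time

GRACE_SECONDS = 3600   # hypothetical: how long 'try again' must persist


def again_means_nonexist(hints, domain, got_try_again, now=None):
    """Return True once 'domain' has given 'try again' for GRACE_SECONDS.

    'hints' is a mapping shared by all record types; the key prefix keeps
    this kind of record apart from any others stored in it.
    """
    now = time.time() if now is None else now
    key = "dns_again:" + domain
    if not got_try_again:
        if key in hints:
            del hints[key]         # the DNS answered, so forget the episode
        return False
    if key not in hints:
        hints[key] = str(now)      # first failure: start the clock
        return False
    first = hints[key]
    if isinstance(first, bytes):   # some stores hand values back as bytes
        first = first.decode()
    return now - float(first) >= GRACE_SECONDS
```

Other odd items could share the same store simply by using a different
key prefix.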

Miscellaneous

   * smtp_verify should be replaced by smtp_vrfy_hosts, so that it works in
     the same way as smtp_expn_hosts.


   * A relic from very early days is the existence of patch space at the end
     of the default values for the configuration file name, Exim's path,
     log_file_path, pid_file_path, received_header_text, smtp_banner, and
     spool_directory. (Grep src/globals.c for `patch' if you want to see
     what I'm on about.) Does anybody actually use this facility? If not, it
     should be removed. I received one request to retain it, but nobody has
     admitted actually using it...


   * A big update like this is an opportunity to remove a number of items
     that remain for backwards compatibility, for example synonyms for
     option names.


   * The -f option for the exim_tidydb utility was invented because I was
     worried that the checking it does would take a lot of time and so
     should not be mandatory. I've changed my mind; I think it might as well
     always do the full check.


   * ignore_errmsg_errors and ignore_errmsg_errors_after could be
     amalgamated into a single option, with a time of 0s meaning `ignore
     immediately'. The default would be a large time value, or an unsettable
     value such as -1. There has been a suggestion for an option to send the
     failing messages to the postmaster once the time has expired.


   * There is at present no default message size limit. I think there should
     be one, but perhaps this should be in the default configuration rather
     than the code. Something like 10M or 20M perhaps?


   * The default setting of smtp_connect_backlog should be raised from 5 to,
     say, 20.


   * Currently, Exim has no features to aid debugging in the event of a
     process crash. On the whole, core dumps don't happen, because it is a
     setuid process. Catching crashes is always a tricky business, because
     it is easy to destroy the evidence. However, I plan to investigate ways
     of preserving more information in the event of a crash.


   * The daemon can listen on multiple interfaces, but there is only one
     port number. This can be generalized. Adding a port to an IP address
     using a dot as a separator is something that works with both IPv4 and
     IPv6 addresses, and it also avoids problems with colon as a list
     separator:


         local_interfaces = 1.2.3.4.25 : 5.6.7.8.487 : ::::1.42


   * -oX allows the incoming port to be specified on the command line, but
     there is no means of specifying the incoming interface(s). For
     consistency, this should be fixed, and -oX could be used for both types
     of data.


   * When an address is being routed, its constituents are in $local_part
     and $domain, but there is currently no variable that contains the whole
     thing. It could be put into $recipient, but that risks confusion with
     $recipients (which is available in system filters). Maybe $address
     could be used?


   * There was a suggestion that Exim be capable of using values from the
     environment in its configuration. Something like


         ${env:NAME}


     I don't like this idea for the following reasons: (1) The environment
     could only be useful when submitting a message for delivery. All that
     happens at this time is address rewriting. By the time delivery occurs,
     the environment has long gone. I'm sure people would get very confused
     by this. (2) It might also tempt admins into setting up configurations
     that ordinary users could confuse by playing with the environment.

   * It has been reported that some spam programs send out a complete SMTP
     session in a single write(), without waiting for any responses. This is
     contrary to the rules of SMTP. Even when the PIPELINING extension is in
     use, there are places where the client must wait for a response before
     proceeding. Exim could check that there is no outstanding data when it
     sends such responses, and take some suitable blocking action.


   * A queue runner that does only first deliveries of messages has been
     requested for use in environments where all messages are queued when
     they arrive.


   * There has been a request for an expansion item that includes the
     contents of a file in a string. It would have two arguments: the first
     argument is the file name, and the second is a string that is used to
     replace every newline in the file. If omitted, the newlines remain. For
     example, this would allow you to pick up a host list from a file by
     using a setting such as


         hosts = ${getfile{/some/file}{:}}
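
The behaviour requested for the file-inclusion item is easy to state in
code. The Python sketch below is an illustration only, not Exim code; it
follows the description literally by replacing every newline, including
the final one. The function name getfile is borrowed from the example
above, and the file contents are invented.

```python
import os
import tempfile


def getfile(path, newline_replacement=None):
    """Return the contents of 'path' as a string.

    If 'newline_replacement' is given, every newline is replaced by it;
    if it is omitted, the newlines remain.
    """
    with open(path) as f:
        data = f.read()
    if newline_replacement is None:
        return data
    return data.replace("\n", newline_replacement)


# Usage: a two-line file of host names turned into a colon-separated list.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"host1.example\nhost2.example\n")
os.close(fd)
joined = getfile(demo_path, ":")
raw = getfile(demo_path)
os.unlink(demo_path)
print(joined)
```

Note that a file ending in a newline yields a trailing separator; whether
that should be trimmed is a detail the request does not settle.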

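Returning to the local_interfaces item in the list above, the following
Python sketch (an illustration only, not Exim code) shows how such a
setting could be taken apart: the list is split on single colons, doubled
colons are read as literal colons, and the port is whatever follows the
last dot. It assumes every item carries a port; the function names are
invented.

```python
def split_list(text):
    """Split a colon-separated list in which '::' means a literal colon."""
    items, current, i = [], [], 0
    while i < len(text):
        if text[i] == ":":
            if text[i:i + 2] == "::":
                current.append(":")       # doubled colon: keep one
                i += 2
                continue
            items.append("".join(current).strip())   # single colon: separator
            current = []
        else:
            current.append(text[i])
        i += 1
    items.append("".join(current).strip())
    return items


def split_port(item):
    """Split 'address.port' at the last dot into (address, port)."""
    address, _, port = item.rpartition(".")
    return address, int(port)


setting = "1.2.3.4.25 : 5.6.7.8.487 : ::::1.42"
interfaces = [split_port(item) for item in split_list(setting)]
print(interfaces)
```

An item without a port cannot be told apart from one with a port by this
rule alone (1.2.3.4 would be read as address 1.2.3 and port 4), so the
syntax would need a convention for that case.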

Terminology

There are a few places where it might be helpful to change names, for
consistency, and to make things more understandable.

   * I've used both `receiver' and `recipient' in different places. I think
     `receiver' should be changed to `recipient' for consistency.


   * The helo_verify option should be renamed as helo_verify_hosts because
     it is a host list.


   * If the major policy checking restructuring goes ahead, the tag rbl_
     will not exist for any option (that is, no option's name will start
     with rbl_). It therefore seems sensible to rename variables as well, so
     as to remove the specific name. $rbl_domain would become
     $blacklist_domain and $rbl_text would become $blacklist_text.


   * remote_sort should be renamed remote_sort_domains because it is a
     domain list.


Internals

A major upheaval is also an opportunity to do some internal tidying up. This
is not of interest to the majority of Exim users, but it should make life
easier for me and anybody else who has to maintain Exim. I have discovered
that there is a fancy name for this: it is called `refactoring'. I won't go
into details here, but I have a list of points if anybody is interested.

Documentation

From the comments people make, it is clear that the HTML documentation is
used in preference to the other formats a lot of the time. I recently did
some work to improve it - for example, it now includes chapter and section
numbers. However, while it continues to be produced using an intermediate
Texinfo stage, there will always be constraints. I propose to write a new
Perl script to generate HTML directly from the original marked up sources,
bypassing the Texinfo step.

------------------------------------------------------------------------