[Exim] Understanding directors + deleting duplicate messages

Author: Ross Boylan
Date:
To: exim-users
Subject: [Exim] Understanding directors + deleting duplicate messages

Duplicate messages are a chronic nuisance for me. Their causes are varied,
but I would like to get rid of them. I gather this is considered a job for
procmail. I have various reasons for not wanting to use procmail (see
below), but it seems it should be easy to do in exim.

I'm a home user, getting and sending mail via dial-up. I am the main user
of the system, but my wife sometimes uses it too. My daughter probably will
as well when she's older.

As an added wrinkle, I want mail to my wife to be duplicated to me, so I
can tell her when she's got mail. I'm doing this now with a smartuser
director.

The duplicate elimination needs to be done on a per-user basis, rather than
per-message, so that several people can each receive a copy of the same
message.

I'd appreciate any suggestions or comments on the following strategy:

After the smartuser director just mentioned has split the message:
1. Add a duplicate_kill director.
2. Give it a condition that calls an embedded perl script.
3. The perl script manages a database of seen message keys (keys based on
the message id in the header seem to eliminate a lot of the duplicates for
me). If the message has been seen before, it returns true. Otherwise, it
writes the key into the database and returns false. (This is the part I
can't do without perl, as far as I can tell. Exim's lookup facilities let
me check whether a key exists, but do not let me write one out.)
4. Back in the duplicate_kill director: if the condition is true (the
message is a duplicate), it uses an appendfile transport to stuff the
message in a duplicates file (just in case). Otherwise, processing
proceeds as normal.

Then I could write scripts to clean out the junk. (I might write to the
database using the indicated key and the date as a value, to make it easy
to age things).
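
To make the check-and-record step concrete, here is a rough sketch of the
logic I have in mind. It is written in Python with SQLite purely for
illustration; it is not the embedded perl itself, and the function names,
table layout, and the idea of keying on (recipient, message id) are my own
assumptions rather than anything exim provides.

```python
import sqlite3
import time


def open_db(path):
    """Open (or create) the database of seen keys. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen ("
        " recipient TEXT NOT NULL,"
        " message_id TEXT NOT NULL,"
        " first_seen REAL NOT NULL,"
        " PRIMARY KEY (recipient, message_id))"
    )
    return conn


def is_duplicate(conn, recipient, message_id, now=None):
    """Return True if this recipient has already had this message id.

    Otherwise record the key with a timestamp and return False.
    The primary key makes the check and the write a single step.
    """
    now = time.time() if now is None else now
    with conn:  # commits on success
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen VALUES (?, ?, ?)",
            (recipient, message_id, now),
        )
    # No row inserted means the key was already present.
    return cur.rowcount == 0


def purge_older_than(conn, max_age_seconds, now=None):
    """Age out old keys; returns the number of rows removed."""
    now = time.time() if now is None else now
    with conn:
        cur = conn.execute(
            "DELETE FROM seen WHERE first_seen < ?",
            (now - max_age_seconds,),
        )
    return cur.rowcount
```

Keying on the recipient as well as the message id is what keeps the check
per-user: the copy for my wife and the copy for me are recorded separately,
so neither is treated as a duplicate of the other. Storing the time as the
value is what makes the clean-up script a one-line delete.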

I'm running Debian potato. exim is Debian's recommended MTA; unfortunately
the packaged binaries (reportedly) do not have embedded perl enabled.

Footnote: Why I don't want to use procmail:
1. It's another program to learn, administer, and run.
2. It appears to lack any real documentation (all I see are FAQs, tips, and
guides, but nothing laying out the command line options and the file format).
3. I have already invested time learning exim and getting its filters just
so. If I use the filters (local ones), I'm not using procmail, and
vice-versa (I think).
4. If I use procmail, or any other pipe, and try to send the results back
to exim, I have to worry about setting up a special port, making sure I'm
conforming to Debian's and exim's requirements, and using special tricks to
prevent loops.
5. It just seems ridiculous to have something described as an MTA which
neither gets nor sends most of the mail (since fetchmail and procmail do
the work).

Speaking of documentation: It wasn't clear to me what addressing
information different macros would pull out. Mail for me typically goes
like this.
Sender -> intermediates -> RossBoylan@??? which forwards it
-> my actual account at my ISP -> fetchmail -> my local system (which
includes some address rewriting rules).

I tried it and think I figured out what was going on (namely that
intermediates didn't count). For example, I thought maybe mindspring (my
ISP) would show up as the sender of everything, since it was the
immediately preceding system in the delivery chain. But it would have been
nice to have more of an explanation.

Also, the manual is very nice as a reference, but is not so helpful for
getting started. It provides little guidance for what is essential or
important vs peripheral. Now maybe you shouldn't use exim unless you know
what you're doing, but I think I'm not the only one using it as I do
(namely a home system for which I want to use mail, not become a mail
guru). It would be nice to have something like a user's guide.