OK, after a beta phase of about 4 weeks with no reports of major
blowups, here is exiscan-acl revision 16:
http://duncanthrax.net/exiscan-acl/
As announced previously, -15 was a beta version that was not advertised
on the web site. To give you an instant overview, I'll just post the
CHANGELOG and the documentation about the new MIME ACL here:
-- CHANGELOG -----------------------------------------------------
16 - Major new feature: the MIME ACL. Read all about it in
the new doc/exiscan-acl-spec.txt file.
- removed variable-length array from demime.c. The very
much advanced native SCO Unixware compiler can't
handle this. Glad no one ported this to gcc.
- Fix: only feed files called "winmail.dat" to the TNEF
decoder. It seems it likes to crash on some arbitrary
files (Which is a bug in itself, but I don't feel like
debugging the mess which is tnef.c and tnef.h. And
I don't feel like writing my own TNEF support either.).
------------------------------------------------------------------
-- MIME-ACL DOCS -------------------------------------------------
-- (excerpt from exiscan-acl-spec.txt) ---------------------------
1. The acl_smtp_mime MIME ACL
--------------------------------------------------------------
Note: if you are not familiar with Exim's ACL system, please go
read the documentation on it, otherwise this chapter will not
make much sense to you.
Here are the facts on acl_smtp_mime:
- It is called once for each MIME part of a message,
including multipart types, in the sequence of their
position in the message.
- It is called just before the acl_smtp_data ACL. They share
a result code (the one passed to the remote system after
DATA). When a call to acl_smtp_mime does not yield
"accept", ACL processing is aborted and the respective
result code is sent to the remote mailer. This means that
the acl_smtp_data is NOT called any more.
- It is ONLY called if the message has a MIME-Version header.
- MIME parts will NOT be dumped to disk by default; you have
to call the "decode" condition to do that (see further
below).
- For RFC822 attachments (these are messages attached to
messages, with a content-type of 'message/rfc822'),
the ACL is called again in the same manner as
for the "primary" message, only that the $mime_is_rfc822
expansion variable is set (see below). These messages
are always decoded to disk before being checked, but
the files are unlinked once the check is done.
To activate acl_smtp_mime, you need to assign it the name
of an ACL entry in section 1 of the config file, and then
write that ACL in the ACL section, like:
/* ---------------
# -- section 1 ----
[ ... ]
acl_smtp_mime = my_mime_acl
[ ... ]
# -- acl section ----
begin acl
[ ... ]
my_mime_acl:
< ACL logic >
[ ... ]
---------------- */
The following list describes all expansion variables that are
available in the MIME ACL:
$mime_content_type
------------------
A very important variable. If the MIME part has a
"Content-Type:" header, this variable will contain its value,
lowercased, and WITHOUT any options (like "name" or
"charset", see below for these). Here are some examples of
popular MIME types, as they may appear in this variable:
text/plain
text/html
application/octet-stream
image/jpeg
audio/midi
If the MIME part has no "Content-Type:" header, this
variable is the empty string.
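As a minimal sketch of how this variable might be used (the ACL
name my_mime_acl and the policy itself are only illustrative
assumptions, not a recommendation), a MIME ACL could refuse one
specific content type like this:
/* ---------------
my_mime_acl:
  # $mime_content_type is already lowercased, so a plain
  # string comparison is enough here.
  deny  message   = audio/midi attachments are not accepted here
        condition = ${if eq{$mime_content_type}{audio/midi}{yes}{no}}
  accept
---------------- */
Because the variable carries no options, there is no need to
strip "charset" or "name" parameters before comparing.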
$mime_filename
--------------
Another important variable, possibly the most important one.
It contains a proposed filename for an attachment, if one
was found in either the "Content-Type:" or
"Content-Disposition:" headers. The filename will be RFC2047
decoded, however NO additional sanity checks are done. See
instructions on "decode" further below. If no filename was
found, this variable is the empty string.
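A hedged example of a filename check follows. The extension list
is an arbitrary assumption for illustration, and the pattern is
wrapped in \N ... \N so that Exim does not expand the dollar
sign and backslash inside it:
/* ---------------
my_mime_acl:
  # (?i) makes the match case-insensitive, since the
  # proposed filename is not normalized in any way.
  deny  message   = attachment has a blocked file extension
        condition = ${if match{$mime_filename}{\N(?i)\.(exe|pif|scr)$\N}{yes}{no}}
  accept
---------------- */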
$mime_charset
-------------
Contains the charset identifier, if one was found in the
"Content-Type:" header. Examples for charset identifiers are
us-ascii
gb2312 (Chinese)
iso-8859-1
Please note that this value will NOT be normalized, so you
should do matches case-insensitively.
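One way to get a case-insensitive comparison is to lowercase the
value with the ${lc:...} expansion operator first. This is just
a sketch; the charset picked here is only an example:
/* ---------------
my_mime_acl:
  # Lowercase the charset before comparing, because
  # "GB2312", "gb2312" and "Gb2312" may all appear.
  deny  message   = parts in charset gb2312 are not accepted here
        condition = ${if eq{${lc:$mime_charset}}{gb2312}{yes}{no}}
  accept
---------------- */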
$mime_boundary
--------------
If the current part is a multipart (see $mime_is_multipart
below), it SHOULD have a boundary string. It is stored in
this variable. If the current part has no boundary parameter
in the "Content-Type:" header, this variable contains the
empty string.
$mime_content_disposition
-------------------------
Contains the normalized content of the
"Content-Disposition:" header. You can expect strings like
"attachment" or "inline" here.
$mime_content_transfer_encoding
-------------------------------
Contains the normalized content of the
"Content-Transfer-Encoding:" header. This is a symbolic name for
an encoding type. Typical values are "base64" and
"quoted-printable".
$mime_content_id
----------------
Contains the normalized content of the "Content-ID:"
header. This is a unique ID that can be used to
reference a part from another part.
$mime_content_description
-------------------------
Contains the normalized content of the
"Content-Description:" header. It can contain a human-readable
description of the part's content. Some implementations will
repeat the filename for attachments here, but the description is
usually only used for display purposes.
$mime_part_count
----------------
This is a counter that is raised for each processed MIME
part. It starts at zero for the very first part (which is
usually a multipart). The counter is per-message, so it is
reset when processing RFC822 attachments (see
$mime_is_rfc822). The counter stays set after acl_smtp_mime
is complete, so you can use it in the DATA ACL to determine
the number of MIME parts of a message. For non-MIME
messages, this variable will contain the value -1.
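For instance, the DATA ACL could use the counter to refuse
messages with an unusually large number of parts. The ACL name
my_data_acl and the limit of 100 are assumptions made up for
this sketch:
/* ---------------
# -- section 1 ----
acl_smtp_data = my_data_acl
# -- acl section ----
my_data_acl:
  # Non-MIME messages carry -1 here and pass this check.
  deny  message   = too many MIME parts
        condition = ${if >{$mime_part_count}{100}{yes}{no}}
  accept
---------------- */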
$mime_is_multipart
------------------
A "helper" flag that is true (1) when the current
part has the main type "multipart", for example
"multipart/alternative" or "multipart/mixed". Since
multipart entities only serve as containers for other parts,
you may not want to carry out specific actions on them.
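A simple way to skip container parts, sketched under the
assumption that the flag expands to a false value (such as 0 or
the empty string) for non-multipart parts, is to accept them
before any other check runs:
/* ---------------
my_mime_acl:
  # Containers hold no payload of their own; their
  # sub-parts each get a separate call to this ACL.
  accept condition = $mime_is_multipart
  # ... checks for "real" parts go here ...
  accept
---------------- */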
$mime_is_rfc822
---------------
This flag is true (1) if the current part is NOT a part of
the checked message itself, but part of an attached message.
Attached message decoding is fully recursive.
$mime_decoded_filename
----------------------
This variable is only set after the "decode" condition (see
below) has been successfully run. It contains the full path
and file name of the file containing the decoded data.
The expansion variables only reflect the content of the MIME
headers for each part. To actually decode the part to disk,
you can use the "decode" condition. The general syntax is
decode = [/<PATH>/]<FILENAME>
The right hand side is expanded before use. After expansion,
the value can
- be '0' or 'false', in which case no decoding is done.
- be the string 'default'. In that case, the file will be
put in the temporary "default" directory
<spool_directory>/scan/<message_id>/
with a sequential file name, consisting of the message id
and a sequence number. The full path and name is available
in $mime_decoded_filename after decoding.
- start with a slash. If the full name is an existing
directory, it will be used as a replacement for the
"default" directory. The filename will then also be
sequentially assigned. If the name does not exist, it will
be used as the full path and file name.
- not start with a slash. It will then be used as the
filename, and the default path will be used.
You can easily decode a file with its original, proposed
filename using "decode = $mime_filename". However, you should
keep in mind that $mime_filename might contain anything. If
you place files outside of the default path, they will not be
automatically unlinked.
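The safest variant is probably "decode = default", which never
uses the sender-supplied name. The following is only a sketch;
the log text is made up for illustration:
/* ---------------
my_mime_acl:
  # Decode into <spool_directory>/scan/<message_id>/ with
  # a sequential name, then log where the data ended up.
  warn  decode      = default
        log_message = decoded part $mime_part_count to $mime_decoded_filename
  accept
---------------- */
Files written this way stay inside the default path, so they
are unlinked automatically.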
The MIME ACL also supports the regex= and mime_regex=
conditions. You can use those to match regular expressions
against raw and decoded MIME parts, respectively. Read the
next section for more information on these conditions.
2. Match message or MIME parts against regular expressions
--------------------------------------------------------------
The "regex" condition takes one or more regular expressions as
arguments and matches them against the full message (when
called in the DATA ACL) or a raw MIME part (when called in the
MIME ACL). The "regex" condition matches linewise, with a
maximum line length of 32k characters. That means you can't
have multiline matches with the "regex" condition.
The "mime_regex" can only be called in the MIME ACL. It
matches up to 32k of decoded content (the whole content at
once, not linewise). If the part has not been decoded with the
"decode" condition earlier in the ACL, it is decoded
automatically when "mime_regex" is executed (using default
path and filename values). If the decoded data is larger
than 32k, only the first 32k characters will be
matched.
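As a rough sketch (the pattern is just a placeholder), a check
on decoded content in the MIME ACL might look like this:
/* ---------------
my_mime_acl:
  # No explicit "decode" is needed; mime_regex decodes
  # the part itself if that has not happened yet.
  deny  message    = decoded part matches a blacklisted regex
        mime_regex = URGENT BUSINESS PROPOSAL
  accept
---------------- */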
The regular expressions are passed as a colon-separated list.
To include a literal colon, you must double it. Since the
whole right-hand side string is expanded before being used,
you must also escape dollar ($) signs with backslashes.
Here is a simple example:
/* ----------------------
deny message = contains blacklisted regex ($regex_match_string)
regex = [Mm]ortgage : URGENT BUSINESS PROPOSAL
----------------------- */
The conditions return true if one of the regular
expressions has matched. The $regex_match_string expansion
variable is then set up and contains the matching regular
expression.
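To illustrate the escaping rules described above, here is a
hedged second example with two made-up patterns. The first one
needs a literal colon, which is doubled; the second one ends
with a regex end-of-line anchor, whose dollar sign is escaped
with a backslash so that the expansion leaves it alone:
/* ----------------------
deny message = contains blacklisted regex ($regex_match_string)
regex = http:://[0-9.]+/ : free money\$
----------------------- */
After expansion and list splitting, the two patterns that are
actually matched should be "http://[0-9.]+/" and "free money$".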
Warning: With large messages, these conditions can be fairly
CPU-intensive.
------------------------------------------------------------------
regards,
/tom