Re: [EXIM] Remove attachments

Author: Marc Haber
Date:  
To: exim-users
Subject: Re: [EXIM] Remove attachments
On Wed, 9 Jun 1999 09:32:30 +0100 (BST), you wrote:
>On Tue, 8 Jun 1999, Marc Haber wrote:
>> How about this:
>> - exim reads the spool file
>> - exim feeds the spool file's content to the filter program
>> - exim reads the filter's output back
>> - exim writes that output to a different file (that is also locked)
>> - exim deletes the old spool file
>> - exim renames the new file to the old name.
>
>That is the kind of thing I was referring to, but I need to investigate
>and do experiments to see if it behaves as expected. I have made a note
>on the Wish List.


Thanks.

>The second last operation (delete old file) should *not* be done. You
>want the file always to exist - a rename replaces one file with another
>atomically. Deleting the old one first would leave a gap when there was
>no file.


Why would I want that file to exist? Only for locking purposes? That
would be taken care of if the locking mechanism is changed.
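
For reference, the rename-without-delete sequence from the quoted text can be sketched roughly as follows. This is Python, not Exim's C, and the function name replace_atomically is made up for the example; it only illustrates that a rename onto an existing name swaps the content without the name ever being absent, assuming both files live on the same filesystem.

```python
import os
import tempfile

def replace_atomically(path, new_content):
    """Write new_content to a temporary file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file has to be in the same directory (same filesystem),
    # otherwise the rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(new_content)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The old file is never unlinked first: the rename replaces it
        # in a single step, so the name always refers to some file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that this sketch says nothing about locking; it only covers the "no gap" property of rename.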

>> Removing attachments will probably need to change some MIME headers as
>> well.
>
>That would have to be done with "headers remove".


Again, this would work for the simple example case, but not for
anything more complicated. Imagine a filter that unpacks ZIP files. In
that case, not only must the MIME-Header be _altered_ (not deleted),
it must be altered in accordance with the result of the filtering
process.

>I still feel this is getting overly-complicated. In particular, the
>headers could not be modified in this way because Exim is holding them
>in main memory during processing. I am much happier with a scheme that
>does such involved message modifications entirely outside of Exim.


This is a point that I can understand. However, it surely would be
possible to write the headers to the pipe before starting with the
message and to replace the memory-stored headers with the headers the
filter generates. But maybe I'm asking too much here, since we'd have
to deal with the case where the filter modifies the headers in a way
that affects routing or directing decisions that have already been taken.

>> Oh btw, the re-deliver approach mentioned in the FAQ will cause
>> messages to need two queue runs to be actually delivered if
>> queue_remote is set.
>
>Not true. The system filter is run at the start of delivery, before it
>looks at any addresses and decides whether they are remote or not.


The pipe-and-redeliver approach mentioned in the FAQ doesn't use a
system filter. However, the case that I am mentioning in fact affects
outgoing mail that is also virscanned via pipe-and-redeliver on one of
my systems:

# transport
|virscan:
| driver = pipe
| bsmtp = all
| batch_max = 32767
| bsmtp_helo = true
| command = "/usr/local/virscan/bin/scanmail $sender_host_address $message_id /var/log/exim_virscan 1"
| current_directory = "/tmp"
| from_hack = false
| freeze_exec_fail = false
| group = virscan
| ignore_status = false
| log_defer_output = false
| log_fail_output = false
| log_output = true
| prefix =
| return_output = false
| return_path_add = false
| timeout = 6h
| umask = 022
| use_shell = false
| user = virscan


#router
|vircheck:
| condition = "${if eq {$received_protocol}{scanned-ok} {0}{1}}"
| driver = domainlist
| route_list = "*"
| transport = virscan


With queue_remote_domains = *, this leads to:

|1999-06-08 12:16:29 exim 3.02 daemon started: pid=1588, no queue runs, listening for SMTP on port 25
|1999-06-08 12:17:02 10rIwI-0000Po-00 <= mh@??? H=myhost.mydomain.de [192.168.10.21] P=esmtp S=655 id=E1BA6F2016F0D21187400000E864096F042D@??? T="blubber"
|1999-06-08 12:17:02 10rIwI-0000Po-00 == marc.haber-lists@??? routing defer (-45): remote delivery skipped
|1999-06-08 12:18:20 Start queue run: pid=1850 -qq
|1999-06-08 12:18:24 10rIxc-0000UI-00 <= mh@??? U=virscan P=scanned-ok S=823 id=E1BA6F2016F0D21187400000E864096F042D@??? T="blubber"
|1999-06-08 12:18:24 10rIxc-0000UI-00 == marc.haber-lists@??? routing defer (-45): remote delivery skipped
|1999-06-08 12:18:24 10rIwI-0000Po-00 => marc.haber-lists@??? R=vircheck T=virscan
|1999-06-08 12:18:24 10rIwI-0000Po-00 Completed
|1999-06-08 12:18:24 10rIxc-0000UI-00 Spool file is locked
|1999-06-08 12:18:24 End queue run: pid=1850

As you can see, the pipe delivery to the virus scanner is deferred until
the queue run, and the rescanned message is then deferred again.

Now, your other mail:
>Further -- I've just seen the fatal flaw in this. It goes as follows:
>
>. Process A is playing this game. It opens the old file, locks it,
> writes the new file, and locks it.
>
>. Process B (an Exim queue runner, say) comes along, opens the old file.
>
>. Process A now renames the new file as the old file and closes the file
> descriptor for the old file. This gives up the lock on the old file.
>
>. Process B now tries to lock the open file (the old one) and
> successfully gets the lock, so goes on to try to deliver the message,
> using the wrong file.
>
>Oops.


Ouch.
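
To make the race concrete: the lock belongs to the file that was opened, not to the name, so after the rename Process B holds a lock on a file the name no longer refers to. One common way to notice this (not something this thread says Exim does) is to compare the locked descriptor with the name after getting the lock and to retry on mismatch. A minimal Python sketch, assuming POSIX fcntl locks and assuming that every process replacing the file holds the lock on the old file while it renames; open_and_lock is a hypothetical helper name:

```python
import fcntl
import os

def open_and_lock(path, retries=5):
    """Open path and lock it, retrying if the file was replaced meanwhile."""
    for _ in range(retries):
        fd = os.open(path, os.O_RDWR)
        try:
            # Blocks until the lock on the file we opened is available.
            fcntl.lockf(fd, fcntl.LOCK_EX)
            # The lock is on whatever file we opened. Check that the
            # name still points at that same file.
            held = os.fstat(fd)
            current = os.stat(path)
            if (held.st_dev, held.st_ino) == (current.st_dev, current.st_ino):
                return fd
        except BaseException:
            os.close(fd)
            raise
        # Stale file: someone renamed a new file over the name.
        os.close(fd)
    raise RuntimeError("file kept changing while trying to lock it")
```

Whether a check like this would fit into Exim's spool handling is a separate question; the sketch only shows why "open, then lock" alone is not enough once files get replaced by rename.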

>As I suspected initially, it looks as though this cannot be done without
>re-implementing the locking completely.


I suspect that taking out a lock on the header file won't work because
the headers are subject to change during the filter run. So, a
dedicated lock file (maybe <message-id>-L?) would be needed here. I
can't judge how hard it is to make this change.
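
A dedicated lock file of this kind could look roughly like the following Python sketch. It assumes the usual O_CREAT|O_EXCL convention, where creating the file is the act of taking the lock; the function names are made up, the "<message-id>-L" name is only the suggestion from above, and stale lock files left behind by a crashed process are not handled at all.

```python
import os

def acquire_lock_file(lock_path):
    """Try to take the lock by creating lock_path; return True on success."""
    try:
        # O_EXCL makes the create fail if the file already exists, so
        # only one process can succeed.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock:
        # Record the owner, purely as a debugging aid.
        lock.write(str(os.getpid()))
    return True

def release_lock_file(lock_path):
    """Give up the lock by removing the lock file."""
    os.unlink(lock_path)
```

Because the lock is tied to a name that is never renamed, the data and header files could be replaced freely while the lock is held.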

Now, we have two possible solutions for the same problem:

(A) pipe-and-redeliver
This is an approach that can be used today. exim doesn't care what the
filter is doing to the message since it delivers the message to the
pipe and forgets about it. When the filter is finished, it redelivers
the message, initiating a new run of exim in a different process.

We have a clean separation of delivery and filtering at the expense of a
less legible log file, some interlocking problems and an annoying
quirk if queue_remote_domains is being used.

(B) filter
This looks like a clean and beautiful approach that would facilitate
almost any processing step in a single delivery process integrated
into exim.

However, this would need substantial changes in exim's locking
mechanism and in the header processing.

Greetings
Marc

-- 
-------------------------------------- !! No courtesy copies, please !! -----
Marc Haber          |   " Questions are the         | Mailadresse im Header
Karlsruhe, Germany  |     Beginning of Wisdom "     | Fon: *49 721 966 32 15
Nordisch by Nature  | Lt. Worf, TNG "Rightful Heir" | Fax: *49 721 966 31 29

