[Exim] Listar, Exim and file locking

Author: Ron Brogden
Date:  
To: exim-users
Subject: [Exim] Listar, Exim and file locking
Howdy. I have been having a long-running issue with a large mailing list I
look after, and the problem seems to boil down to a file locking issue
between Listar (the MLM) and Exim (the MTA). The scenario is as follows:

The maintainer sends a message to the list (50,000-60,000 members). Exim
receives the message and then opens a pipe to pass it to Listar. Listar
starts transmitting messages to the list members but at a certain point
stalls, apparently waiting for a response from Exim. Exim reports that the
spool file is locked and keeps reporting this until the maximum execution
time is reached, at which point it kills off the stalled Listar process. Here
is what the logs show (only IPs and email addresses changed):

listar.log:

[06/29/2001-19:33:48] [65557] IO IN : 451 Cannot check
<obscured@obscured> at this time - please try later
[06/29/2001-19:33:48] [65557] Response: 451 Cannot check
<obscured@obscured> at this time - please try later
[06/29/2001-19:33:48] [65557] Receipient 'obscured@obscured' rejected!
[06/29/2001-19:33:48] [65557] Setvar: 'smtp-last-error'='451 Cannot check
<obscured@obscured> at this time - please try later' at level 8
[06/29/2001-19:33:48] [65557] Setvar: 'bounce-error'='451 Cannot check
<obscured@obscured> at this time - please try later' at level 8
[06/29/2001-19:33:48] [65557] Setvar:
'bounce-address'='obscured@obscured' at level 8
[06/29/2001-19:33:48] [65557] Setvar: 'hooktype'='LOCAL-BOUNCE' at level 8
[06/29/2001-19:33:48] [65557] Running a hook type 'LOCAL-BOUNCE'
[06/29/2001-19:33:48] [65557] list_directory: /home/listar/lists/newsletter
[06/29/2001-19:33:48] [65557] Bounce: newsletter: <obscured@obscured> (local
bounce detected) processed.

The above is the last entry for this process.
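
For what it is worth, 451 falls in the transient (4xx) class of SMTP replies, meaning "try again later" rather than "this address is bad". A rough way to see how many such replies each Listar process received is sketched below in Python. It assumes the log line layout shown above; the names `reply_class`, `count_replies` and `listar_sample` are made up for illustration and are not part of Listar.

```python
import re
from collections import Counter

# Matches lines such as:
# [06/29/2001-19:33:48] [65557] Response: 451 Cannot check ...
# Group 1 is the process ID, group 2 the three-digit SMTP reply code.
RESPONSE_RE = re.compile(r"^\[[^\]]+\] \[(\d+)\] Response: (\d{3})\b")


def reply_class(code):
    """Classify an SMTP reply code by its first digit."""
    if 400 <= code <= 499:
        return "transient"   # temporary failure, sender should retry
    if 500 <= code <= 599:
        return "permanent"   # hard failure
    return "other"


def count_replies(log_lines):
    """Count 'Response:' lines per (process ID, reply class)."""
    counts = Counter()
    for line in log_lines:
        match = RESPONSE_RE.match(line)
        if match:
            pid, code = match.group(1), int(match.group(2))
            counts[(pid, reply_class(code))] += 1
    return counts


# Hypothetical sample, copied from the excerpt above with wrapped lines joined.
listar_sample = [
    "[06/29/2001-19:33:48] [65557] IO IN : 451 Cannot check "
    "<obscured@obscured> at this time - please try later",
    "[06/29/2001-19:33:48] [65557] Response: 451 Cannot check "
    "<obscured@obscured> at this time - please try later",
    "[06/29/2001-19:33:48] [65557] Receipient 'obscured@obscured' rejected!",
]

for (pid, kind), n in sorted(count_replies(listar_sample).items()):
    print(pid, kind, n)
```

Only the `Response:` lines are counted, so the matching `IO IN` line for the same reply is not counted twice.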

Here's what Exim has to say about it:

2001-06-29 18:26:42 15G9Wz-000H3K-00 <= someuser@somewhere H=(someuser)
[192.168.0.1] P=smtp S=3951 id=01C100C9.08212160.someuser@somewhere
2001-06-29 18:33:23 15G9Wz-000H3K-00 Spool file is locked (another process is
handling this message)
[...]
2001-06-30 18:23:03 15G9Wz-000H3K-00 Spool file is locked (another process is
handling this message)
2001-06-30 18:26:42 15G9Wz-000H3K-00 ** |/home/listar/listar -s newsletter
<listaddress@listserver> D=system_aliases T=address_pipe: pipe delivery
process timed out
2001-06-30 18:26:42 15GW0Y-0001tt-00 <= <> R=15G9Wz-000H3K-00 U=root P=local
S=4801
2001-06-30 18:26:42 15G9Wz-000H3K-00 Error message sent to
someuser@somewhere
2001-06-30 18:26:43 15G9Wz-000H3K-00 Completed

It looks like the bounce handling is causing some weirdness, with multiple
processes touching the same message ID, though I am not too clear on what is
going on inside Exim.
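
To put a number on how long the message stayed locked, the "Spool file is locked" entries can be summarized per message ID. The following is only a sketch in Python, assuming the Exim main log format shown above with each entry on a single line; `locked_intervals` and `exim_sample` are hypothetical names used for illustration.

```python
import re
from datetime import datetime

# Timestamp, then an Exim message ID of the form xxxxxx-xxxxxx-xx,
# then the rest of the log entry.
ENTRY_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"([0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{2}) (.*)$"
)


def locked_intervals(log_lines):
    """Map message ID -> (first locked time, last locked time, count)."""
    spans = {}
    for line in log_lines:
        match = ENTRY_RE.match(line)
        if not match or "Spool file is locked" not in match.group(3):
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        msg_id = match.group(2)
        first, _last, count = spans.get(msg_id, (stamp, stamp, 0))
        spans[msg_id] = (first, stamp, count + 1)
    return spans


# Hypothetical sample built from the excerpt above (wrapped lines joined,
# the "[...]" elision left out).
exim_sample = [
    "2001-06-29 18:26:42 15G9Wz-000H3K-00 <= someuser@somewhere "
    "H=(someuser) [192.168.0.1] P=smtp S=3951",
    "2001-06-29 18:33:23 15G9Wz-000H3K-00 Spool file is locked "
    "(another process is handling this message)",
    "2001-06-30 18:23:03 15G9Wz-000H3K-00 Spool file is locked "
    "(another process is handling this message)",
]

for msg_id, (first, last, count) in locked_intervals(exim_sample).items():
    print(msg_id, "locked entries:", count, "span:", last - first)
```

The count here covers only the lines in the sample; run against the full log it would include the entries elided above.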

The version of Exim is "Exim version 3.22 #2" and we are running FreeBSD 4.2.

Now this does not always happen: a successful run was performed a few days
previously, but that was the first complete run in months. We've upgraded
Listar, Exim and the operating system to no avail (not just for this issue;
everything is reasonably up to date).

So, does anyone have any idea what exactly is happening and, more
importantly, how to fix it? If not, any suggestions for low-impact MLMs
other than Listar that play nicely with FreeBSD and Exim?

Any help greatly appreciated.

Cheers,

Ron

--

-----------------------------------------------------------------------------
Island Net AMT Solutions Group Inc.          Telephone:          250 383-0096
1412 Quadra Street                           Toll Free:        1 800 331-3055
Victoria, B.C.                               Fax:                250 383-6698
V8W 2L1                                      E-Mail:    support@???
Canada                                       WWW:   http://www.islandnet.com/
-----------------------------------------------------------------------------