Re: [Exim] Double messages

Author: Marilyn Davis
Date:  
To: exim-users, clint.davis
CC: MLN Administrators
Subject: Re: [Exim] Double messages

Hi Exim Experts,

Thanks for the clues, Philip.

The exim_mainlog is still reporting the locked spool file:

2000-03-27 09:51:42 12W2Qd-0006Nx-00 Spool file is locked

But exiwhat doesn't say anything very useful for me:

[root@deliberate /tmp]# exiwhat
528 3.02 daemon: -q15m, listening on port 25

Could the lock be held by one of the [exim] processes? The log shows 7
locked spool files in the latest queue run, and there are 6 [exim]
processes. The message we're studying was first sent on the 17th, but
these [exim] processes are all older than that.

[root@deliberate /tmp]# ps -ef | grep exim
root       528     1  0 Mar09 ?        00:00:00 /usr/local/bin/exim -bd -q15m
root      3380     1  0 Mar14 ?        00:00:00 [exim]
majordom  3385  3380  0 Mar14 ?        00:00:00 [exim]
majordom  3387  3385  0 Mar14 ?        00:00:00 [exim]
root     19005     1  0 Mar16 ?        00:00:00 [exim]
majordom 19006 19005  0 Mar16 ?        00:00:00 [exim]
majordom 19008 19006  0 Mar16 ?        00:00:00 [exim]
root      9333   649  1 10:00 tty1     00:00:03 emacs exim_mainlog
root      9404  9336  0 10:06 ttyp0    00:00:00 grep exim
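
To compare the log against the process list, it may help to count how
often each message id is reported as locked. The following is only a
rough sketch: it assumes main-log lines have the same shape as the
"Spool file is locked" line quoted above, and the function name
locked_messages is made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log excerpt in this message:
#   2000-03-27 09:51:42 12W2Qd-0006Nx-00 Spool file is locked
LOCKED_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\S+) Spool file is locked"
)


def locked_messages(lines):
    """Return a Counter mapping Exim message id -> number of 'locked' reports."""
    counts = Counter()
    for line in lines:
        match = LOCKED_RE.match(line)
        if match:
            counts[match.group(2)] += 1
    return counts
```

Called as locked_messages(open("exim_mainlog")), where the path is
whatever the main log is called on the local system, it returns the
per-message counts. A message id that shows up in every queue run is a
candidate for a stuck delivery process.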


Am I doing something wrong that I have these old [exim] processes and
that some are owned by majordom?

In thinking about how this happens, does it help to know that we run a
daemon that checks every hour whether our connection has gone down
and restarts it if it has?

Thanks again for thinking about this. Those duplicates are
embarrassing.

Marilyn Davis, Ph.D.
eVote - online polling software for email lists
http://www.deliberate.com 
marilyn@???    
+1 650 965-7121  (USA)

On Fri, 24 Mar 2000, Philip Hazel wrote:

> On Wed, 22 Mar 2000, Marilyn Davis wrote:
>
> > This last time, I did a thorough study of the history of the duplicate
> > message and read everything in the Exim manual about spool locking and
> > I'm at a loss to figure out what to do.
>
> [snip]
>
> > 2000-03-17 11:28:59 12W2Qd-0006Nx-00 <= owner-mln-chat@???
> >   U=majordom P=local S=3247
> >   id=Pine.LNX.4.10.10003171125340.672-100000@???
>
> [snip]
>
> > Now, apparently this process has given up and never again tries any
> > deliveries but has left the lock on the spool file:
> >
> > 2000-03-17 11:45:49 Start queue run: pid=24659
> > 2000-03-17 11:45:49 12W2Qd-0006Nx-00 Spool file is locked
>
> I have seen stuck processes before, but usually when an *incoming*
> TCP/IP call got dropped, not an outgoing one. It seems to be a problem
> in the TCP/IP stack such that a system call fails to time out. A way to
> get out of this situation is to use exiwhat to find out which process is
> working on the message, and kill that process. Then the message is no
> longer locked, and the next queue run will pick it up again. It *should*
> be proof against duplicates.
>
> > Here we recycled the modem, the computer stayed up, and who generated
> > this? You can tell by the id that it is the same message.
> >
> > 2000-03-18 16:52:26 12WTxC-0000Fn-00 <= owner-mln-chat@???
> >   U=majordom P=local S=3247
> >   id=Pine.LNX.4.10.10003171125340.672-100000@???
>
> By the Pine message id it's the same message, but it looks from the log
> that majordom resubmitted it to Exim. Exim has given it a new Exim id.
> So the problem is why did Majordomo resubmit the message over 24 hours
> later? I think this is unrelated to the stuck delivery.
>
> -- 
> Philip Hazel            University of Cambridge Computing Service,
> ph10@???      Cambridge, England. Phone: +44 1223 334714.
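
The resubmission described above (same header message id, new Exim id)
can be looked for across the whole log. This is a sketch under
assumptions, not a tested tool: it assumes each arrival ("<=") record
sits on a single line in the log file and carries an id= field, as in
the excerpts quoted in this thread. The function name resubmitted is
hypothetical.

```python
import re
from collections import defaultdict

# Assumed arrival-line shape (one line in the real log file):
#   <date> <time> <exim-id> <= <sender> U=... P=... S=... id=<header-message-id>
ARRIVAL_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\S+) <= .*?\bid=(\S+)"
)


def resubmitted(lines):
    """Map header message id -> [(timestamp, exim id), ...] for every
    header id that arrived under more than one Exim id."""
    arrivals = defaultdict(list)
    for line in lines:
        match = ARRIVAL_RE.match(line)
        if match:
            when, exim_id, header_id = match.groups()
            arrivals[header_id].append((when, exim_id))
    return {
        header_id: seen
        for header_id, seen in arrivals.items()
        if len({exim_id for _, exim_id in seen}) > 1
    }
```

Each entry in the result gives the timestamps at which the same message
was handed to Exim again, which can then be compared with Majordomo's
own activity around those times.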