Re: [Exim] mbx locking

Author: Ray Miller
Date:  
To: Philip Hazel
CC: exim-users
Subject: Re: [Exim] mbx locking
On Wed Jan 26 09:18:53 2000, Philip Hazel wrote:

> When I started writing Exim I knew nothing about the details of Unix
> locking. People told me that flock() was obsolete - it doesn't work over
> NFS for a start - so I avoided using it, and used fcntl() for locking.


Yes, I think fcntl() is preferred. However, you can still wait on the
lock by using F_SETLKW rather than F_SETLK.
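
As a minimal sketch of the difference (the function name lock_whole_file is made up for illustration and is not taken from Exim), the same struct flock can be passed with either command; only the behaviour when the lock is already held changes:

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not Exim code.
   Apply a whole-file lock of the given type (F_RDLCK, F_WRLCK or F_UNLCK).
   If wait is non-zero, F_SETLKW is used and the call sleeps in the kernel
   until the lock can be granted (or a signal interrupts it); otherwise
   F_SETLK is used and the call fails at once if the lock is held elsewhere.
   Returns 0 on success, -1 with errno set on failure. */
static int lock_whole_file(int fd, short type, int wait)
{
  struct flock lock_data;

  memset(&lock_data, 0, sizeof lock_data);
  lock_data.l_type = type;
  lock_data.l_whence = SEEK_SET;
  lock_data.l_start = 0;
  lock_data.l_len = 0;          /* zero length means "to end of file" */

  return fcntl(fd, wait ? F_SETLKW : F_SETLK, &lock_data);
}
```

With wait set, the caller needs no retry loop of its own, but it also has no upper bound on how long it waits unless something (such as an alarm) interrupts the call.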

> What Exim actually does is to try to get the lock a number of times
> (set by lock_retries - default 10), waiting for the time
> lock_interval (default 3 seconds) between each try. At least this is
> what the code is supposed to do. Bugs are always with us...


I don't think there's a bug with this logic. A test (with -d9)
produced lots of messages:
    
    fcntl() or MBX locking failed - retrying


and a few:

    failed to lock mailbox /home/test1/Mail/INBOX (fcntl)


indicating that exim tried several times to get the lock before
giving up. (This was a simple test with one script invoking exim to
deliver to this mailbox in a tight loop, and another making an IMAP
connection, opening the mailbox, marking all messages for deletion,
and expunging, again in a tight loop.)
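
For comparison, the retry-and-sleep scheme described in the quoted text can be sketched roughly as follows. This is not Exim's actual code: the function name lock_with_retries and its parameters are invented here, standing in for lock_retries and lock_interval, and the choice to skip the sleep after the final try is an assumption.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical sketch, not Exim code.
   Try a non-blocking write lock up to `retries` times, sleeping
   `interval` seconds between tries. Returns 0 once the lock is held,
   or -1 if every try failed or a non-contention error occurred. */
static int lock_with_retries(int fd, int retries, unsigned interval)
{
  struct flock lock_data;
  int i;

  for (i = 0; i < retries; i++)
    {
    memset(&lock_data, 0, sizeof lock_data);
    lock_data.l_type = F_WRLCK;
    lock_data.l_whence = SEEK_SET;   /* l_start = l_len = 0: whole file */

    if (fcntl(fd, F_SETLK, &lock_data) >= 0) return 0;

    /* POSIX allows either EACCES or EAGAIN when the lock is held
       by another process; anything else is a real error. */
    if (errno != EACCES && errno != EAGAIN) return -1;

    if (i < retries - 1) sleep(interval);
    }

  return -1;
}
```

Between tries the process is asleep and does not notice the lock becoming free, so a competitor that releases and re-acquires quickly can win every time.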

> So if you get a locking error, it has been trying for quite some time.
> Would a blocking lock actually help?


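I think it would, because a process blocked in F_SETLKW is woken when
the lock is released, whereas one that polls with F_SETLK and sleeps
between tries can keep missing the moments when the lock is free. I
also cheated and patched exim before repeating the test described
above. This time, exim did not report any errors obtaining a lock.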

Here's the patch (quick and dirty - not for production use):

--- exim-3.12/src/transports/appendfile.c.ORIG    Wed Dec  8 09:57:09 1999
+++ exim-3.12/src/transports/appendfile.c    Mon Jan 24 15:07:57 2000
@@ -1692,6 +1692,7 @@
     #ifdef SUPPORT_MBX
     else if (ob->use_mbx_lock)
       {
+      int rc;
       lock_data.l_type = F_RDLCK;
       lock_data.l_whence = lock_data.l_start = lock_data.l_len = 0;
       if (fcntl(fd, F_SETLK, &lock_data) >= 0 && fstat(fd, &statbuf) >= 0)
@@ -1730,7 +1731,13 @@


         lock_data.l_type = F_WRLCK;
         lock_data.l_whence = lock_data.l_start = lock_data.l_len = 0;
-        if (fcntl(mbx_lockfd, F_SETLK, &lock_data) >= 0) break;
+        sigalrm_seen = FALSE;
+        os_non_restarting_signal(SIGALRM, sigalrm_handler);
+        alarm(30);
+        rc = fcntl(mbx_lockfd, F_SETLKW, &lock_data);
+        alarm(0);
+        signal(SIGALRM, SIG_IGN);
+        if (rc >= 0) break;


         DEBUG(9) debug_printf("failed to lock %s: %s\n", mbx_lockname,
           strerror(errno));


--
Ray Miller <ray.miller@???>
Unix Systems Programmer
Oxford University Computing Services