Re: Solaris 2.5 NFS apparent bug - ideas for workaround?

Author: Greg A. Woods
Date:  
To: Philip Hazel
CC: exim-users
Subject: Re: Solaris 2.5 NFS apparent bug - ideas for workaround?
[ On Wed, February 12, 1997 at 10:11:38 (+0000), Philip Hazel wrote: ]
> Subject: Re: Solaris 2.5 NFS apparent bug - ideas for workaround?
>
> I have been silly. Exim used to proceed as follows:
>
>   open mailbox
>   create lock file
>   lock open mailbox with fcntl
>
> I don't know why I did it this way; I think I had "open the file before 
> locking" stuck in my brain. Anyway, I have now changed it so that it 
> does
>
>   create lock file
>   open mailbox
>   lock open mailbox with fcntl


It may be that the open is necessary on those very systems where NFS is
most likely to be used, if indeed you ever fail to do lock-file-locking.
In smail the OS config decides whether lock-file-locking will be attempted.

I think what you're suggesting is always calling lock_file_by_name() and
then opening the file, and finally doing kernel locking if the OS
supports it, always using fcntl() (or lockf(), since it uses fcntl()?)
where possible.
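
To make that ordering concrete, here is a minimal sketch in C. It is not
taken from exim or smail; take_lock_file(), open_locked_mailbox() and
release_mailbox() are hypothetical names, and the ".lock" suffix is an
assumption. It also assumes an exclusive create is good enough for the
lock file, which has historically not been safe over older NFS versions,
where a link(2)-based scheme is the usual substitute.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: take "<mailbox>.lock" by exclusive create.
 * Returns 0 on success, -1 if the lock is already held or on error. */
static int take_lock_file(const char *mailbox, char *lockname, size_t len)
{
    int fd, n;

    assert(mailbox != NULL && lockname != NULL);
    n = snprintf(lockname, len, "%s.lock", mailbox);
    if (n < 0 || (size_t)n >= len)
        return -1;                      /* name did not fit */
    fd = open(lockname, O_WRONLY | O_CREAT | O_EXCL, 0444);
    if (fd < 0)
        return -1;                      /* someone else holds it */
    close(fd);
    return 0;
}

/* Step 1: create lock file.  Step 2: open mailbox.
 * Step 3: lock the open mailbox with fcntl().
 * Returns the locked descriptor, or -1 with nothing left behind. */
static int open_locked_mailbox(const char *mailbox, char *lockname, size_t len)
{
    struct flock fl;
    int fd;

    if (take_lock_file(mailbox, lockname, len) != 0)
        return -1;
    fd = open(mailbox, O_WRONLY | O_APPEND | O_CREAT, 0600);
    if (fd < 0) {
        unlink(lockname);
        return -1;
    }
    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                       /* 0 = through end of file */
    if (fcntl(fd, F_SETLKW, &fl) < 0) {
        close(fd);
        unlink(lockname);
        return -1;
    }
    return fd;
}

/* Closing the descriptor drops the fcntl() lock; then remove the lock file. */
static void release_mailbox(int fd, const char *lockname)
{
    close(fd);
    unlink(lockname);
}
```

A caller would do its delivery between open_locked_mailbox() and
release_mailbox(); a real mailer would also retry and time out when the
lock file already exists rather than give up at once.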

From the smail source:


#ifdef    lock_fd_wait
/*
 * under 4.3BSD and some 4.2+ operating systems, flock(2) is used
 * to lock mailboxes or other mail files.  FLOCK_MAILBOX should
 * be defined in os.h in this case.  We assume that the file
 * descriptor points to the beginning of the file.
 */


[[ this comment lies slightly since lock_fd_wait() may use lockf() ]]

. . .

    if (! flock_mailbox)
        return lock_file_by_name(fn);
 . . .


#else
/*
 * if the lock_fd_wait macro is not available then we must always
 * use the V6-style file-locking.
 */

. . .

    return lock_file_by_name(fn);


So, smail doesn't use lock-file-locking on mailboxes if the OS config
defines lock_fd_wait() and FLOCK_MAILBOX, which is currently true for
aux2.0, bsd4.3, bsd4.4, linux, mips-bsd4.3, next2.0, and scs4.2; and any
others which include these as bases.

So, on 4.3BSD and related systems, it's important to use flock(2), which
of course requires that a file descriptor be available. Of course on
slightly more modern systems, such as SunOS-4, with NFS, using flock(2)
would guarantee instant brain damage for remote resources, so one could
hope fcntl()/lockf() locking is in use instead on those systems.

I just realized, too, that the last sentence in that first comment above
(about assuming the fd is at the beginning of the file) may not be true
for a file opened with O_APPEND. Unfortunately the SunOS-4.1 manuals
don't mention the behaviour of lockf(3) in this case, though direct use
of fcntl(2) would avoid the issue. Naturally, just to befuddle me, most
of the smail OS configs use lockf() to define lock_fd_wait(). I guess
it depends on how lockf(3) is implemented and, if it uses lseek() to get
the current offset, on how lseek() behaves on files opened with O_APPEND.
[I'm hoping the comment in open(2) that "the seek pointer is set to the
end of the file prior to each write" implies it is reset to where it was
after each write too.]
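
As a rough illustration of why direct fcntl(2) sidesteps the question,
the sketch below locks the whole file with l_whence = SEEK_SET and
l_start = 0, so the current offset never enters into it. The second
function is only a probe for checking what lseek() reports after an
O_APPEND write on a given system. Both function names are hypothetical
and nothing here comes from the smail source. As far as I know POSIX
leaves the offset at the new end of file after such a write rather than
restoring it, but the probe is the way to find out on any particular OS.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write-lock the entire file, independent of the current offset. */
static int lock_whole_file(int fd)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;     /* measure from the start of the file */
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 = through end of file, as it grows */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Hypothetical probe: append data to path and return the offset that
 * lseek() reports afterwards, or (off_t)-1 on error. */
static off_t offset_after_append(const char *path, const char *data)
{
    off_t off;
    int fd;

    assert(path != NULL && data != NULL);
    fd = open(path, O_WRONLY | O_APPEND | O_CREAT, 0600);
    if (fd < 0)
        return (off_t)-1;
    if (lock_whole_file(fd) < 0 || write(fd, data, strlen(data)) < 0) {
        close(fd);
        return (off_t)-1;
    }
    off = lseek(fd, 0, SEEK_CUR);
    close(fd);                  /* also releases the fcntl() lock */
    return off;
}
```

Comparing the probe's result with the file size before and after the
write shows whether a lockf() call made at that point would start from
the beginning of the file or from somewhere later.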

I may try to fix this in smail too, I suppose. I wonder if anyone ever
does delivery to home directories via separate mail delivery hosts, and,
if so, whether any of those systems use a combination of different mailers....

Avoiding the silly kernel locking has *always* been the correct thing to
do, and unfortunately many BSD mail-handling tools didn't always do
this. Traditional file locking is just as easy to implement correctly
in the application and has, as you've discovered, always been more
reliable.

Of course, if not delivering over NFS is possible, then that's an even
better option! ;-)

-- 
                            Greg A. Woods


+1 416 443-1734            VE3TCP            robohack!woods
Planix, Inc. <woods@???>; Secrets of the Weird <woods@???>