SUMMARY nfs caching

From: Daryl Crandall (daryl@dash.mitre.org)
Date: Mon Jul 23 1990 - 10:08:19 CDT


Here is a summary of the replies to my question about the infrequent NFS
caching bug. I've restated the original question and shortened the replies.

The situation is apparently widespread but infrequent. Since I have not been
able to recreate the situation at will, I haven't been able to test any of
the suggested fixes or workarounds. The problem exists under SunOS-4.0.3
but may have been fixed in 4.1.

I hope this information helps someone. This summary should satisfy the
numerous requests for feedback.

        Daryl Crandall
        The MITRE Corporation
        daryl@mitre.org
        (703) 883-7278

############################################################################
ORIGINAL QUESTION:

We seem to be having infrequent manifestations of what appear to be NFS
cache problems. A user on one client edits a file, closes the file then
after five minutes checks the file from another client. The second client
doesn't register the changes, but the disk server has registered the changes.
After much time (~15 minutes) the 2nd client finally sees the changes!

We are running 4.0.3, client #1 is a sun3/60, client #2 is a sun4/260, the
disk server is a sun4/280, the file is in the user's home directory tree, and
the home directory is mounted on all clients and servers.

Client 1 and client 2 receive / and /usr service from different servers, but
/home is from the server for client 1.

I thought the NFS cache problem was solved in 4.0.3?

Is there a workaround, or some way to force the 2nd client to re-read
the updated file from the disk server?

        Daryl Crandall
        The MITRE Corporation
        daryl@mitre.org
############################################################################
        You can play with the nfs mount options for that /home. The actimeo
(sp?) is the timeout on cached nfs file attributes.

        luck, benny
Return-Path: <yih%atom@cs.utah.edu>
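
To make the role of actimeo concrete, here is a minimal Python sketch of a
client that caches file attributes for a fixed number of seconds. It is a toy
model, not SunOS code: the class names ToyServer and ToyClient are invented
for illustration, a real client varies the timeout between minimum and
maximum values rather than using a single number, and the model does not
reproduce the ~15 minute delay described in the question. It only shows why a
smaller actimeo narrows the window in which a second client can see old data.

```python
# Toy model of NFS client attribute caching (illustration only, not real NFS).

class ToyServer:
    """Holds one file: its contents and its modification time."""

    def __init__(self):
        self.mtime = 0
        self.data = "old"

    def write(self, data, now):
        self.data = data
        self.mtime = now


class ToyClient:
    """Trusts its cached attributes for `actimeo` seconds before rechecking."""

    def __init__(self, server, actimeo):
        self.server = server
        self.actimeo = actimeo
        self.cached_mtime = None
        self.cached_data = None
        self.attrs_fetched_at = None

    def read(self, now):
        attrs_fresh = (
            self.attrs_fetched_at is not None
            and now - self.attrs_fetched_at < self.actimeo
        )
        if not attrs_fresh:
            # Cached attributes have expired: ask the server again.
            server_mtime = self.server.mtime
            self.attrs_fetched_at = now
            if server_mtime != self.cached_mtime:
                # The file changed on the server: refetch the data too.
                self.cached_mtime = server_mtime
                self.cached_data = self.server.data
        return self.cached_data


server = ToyServer()
slow = ToyClient(server, actimeo=60)
fast = ToyClient(server, actimeo=0)
slow.read(0)
fast.read(0)
server.write("new", 10)          # another client updates the file at t=10
print(slow.read(20), fast.read(20), slow.read(60))
```

In this model, actimeo=0 makes the client recheck the server on every read,
so the price of fresher data is an attribute request per access.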
############################################################################
>>A user on one client edits a file, closes the file then
>>after five minutes checks the file from another client. The second client
>>doesn't register the changes, but the disk server has registered the changes.
>>After much time (~15 minutes) the 2nd client finally sees the changes!

Try to get a copy of NFS patch tape #2. This will substantially reduce, but
not eliminate this problem. We have a beta copy of the final patch, but
have never been told that we can distribute it or that it is what will finally
be incorporated into 4.1 (or 4.1a).

>>I thought the NFS cache problem was solved in 4.0.3?

Dream on. Operating systems may come and go, but NFS problems go on forever.

From: bit!markm (Mark Morrissey)
############################################################################
We have seen similar things, but so rarely that I have not found it
worthwhile to investigate more thoroughly. One way to force a
re-read seems to be to `touch' the file from the client that sees the
old file. The funny thing is that if you check the date you will find
that the file does not get the time of the touch, but the time of the
update from the first client. If you do another touch, then the time
will be that of the touch.

Configuration: Sun-4/390 + Sun-3 and sparcstation clients. 4.0.3.

From: Leif Andersson <leif@control.lth.se>
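
The touch workaround above could be wrapped in a small helper, sketched here
in Python. The function name touch_and_stat is hypothetical, and the sketch
assumes it is run on the client that still sees the old file. Note that it
changes the file's timestamps, exactly as touch does, so it is not a
side-effect-free way to refresh the cache.

```python
import os


def touch_and_stat(path):
    """Do the equivalent of `touch path`, then return the mtime read back."""
    os.utime(path, None)              # set atime and mtime to the current time
    return os.stat(path).st_mtime


if __name__ == "__main__":
    import sys
    import time

    for name in sys.argv[1:]:
        # Print the timestamp the client reports after the touch.
        print(name, time.ctime(touch_and_stat(name)))
```

If the behaviour described above holds, the time printed after the first call
would be that of the other client's update, and a second call would show the
time of the touch itself.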
############################################################################
We have seen similar effects with stale files. Occasionally we will get a
"stale file handle" message when the file is accessed, but we also observe
the situation you describe. Please forward to me any solutions/discussions
you may receive, or post them to the list.

From: frl@phebos.aps.anl.gov (Frank Lenkszus)
############################################################################
I've had the same problem too and never solved it. If you hear of
a good solution, could you pass it on? Thanks.

From: kirk@zabriskie.berkeley.edu (Kirk Thege)
############################################################################
I saw the same (very perplexing) problem before upgrading to 4.1.
Sun suggested reducing the actimeo values in fstab to short values,
so I now have /etc/fstab mounts that look like:

cellar:/export/admin /es/admin nfs actimeo=0,intr,bg 0 0
toolbox:/usr/cube /usr/cube nfs actimeo=0,intr,bg,ro 0 0
radar:/usr/spool/news /var/spool/news nfs actimeo=0,intr,bg 0 0

This probably helped; however, I seem to recall that the REAL culprit
was selection_svc ("Don't ask!"--it'll probably turn into a feature :^).
Killing and restarting that did help. If need be, I can rummage and
try to find the pertinent msgs (Sun-Manglers, cupla munts ago).

From: vasey@mcc.com (Ron Vasey)
############################################################################
One way (possibly) is to set the timeout option (actimeo) to 1 in
fstab. E.g.

gauss:/u1 /u1 nfs rw,hard,bg,actimeo=1,intr 0 0

This might not eliminate the problem but might make it less of one.

From: "S. Holmes [Consulting Detective]" <sjh@math.purdue.edu>
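
Before changing anything, it may help to see which NFS mounts already set
actimeo. The following Python sketch scans fstab-style text and reports the
value per mount point. The function name nfs_actimeo is an assumption, the
field layout is taken from the fstab lines quoted in the replies above, and
the last sample line (server:/home) is a made-up entry with no actimeo set.

```python
def nfs_actimeo(fstab_text):
    """Map each NFS mount point to its actimeo value (None if not set)."""
    result = {}
    for line in fstab_text.splitlines():
        line = line.split("#", 1)[0].strip()    # drop comments and blank lines
        fields = line.split()
        if len(fields) < 4 or fields[2] != "nfs":
            continue                            # not an NFS entry
        mount_point = fields[1]
        value = None
        for opt in fields[3].split(","):
            if opt.startswith("actimeo="):
                value = int(opt.split("=", 1)[1])
        result[mount_point] = value
    return result


# Entries quoted in this summary, plus one hypothetical mount without actimeo.
sample = """\
cellar:/export/admin /es/admin nfs actimeo=0,intr,bg 0 0
toolbox:/usr/cube /usr/cube nfs actimeo=0,intr,bg,ro 0 0
gauss:/u1 /u1 nfs rw,hard,bg,actimeo=1,intr 0 0
server:/home /home nfs rw,hard,bg,intr 0 0
"""

for mount_point, value in nfs_actimeo(sample).items():
    print(mount_point, value)
```

A mount that reports None is using the client's default attribute cache
timeouts, which makes it a candidate for the actimeo settings suggested above.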
############################################################################


