SUMMARY: Stale NFS file handles

From: rsaddler@cccis.com
Date: Thu Sep 02 1999 - 16:32:03 CDT


In short, we will be rebooting the box, as the server it mounts from is
home to many, many other boxes with live connections almost all the
time.

Many, many thanks for the replies!!!

Notes received (quick summary here, in no particular order) are:

==========

You can restart nfs.client and nfs.server.

cd /etc/init.d
./nfs.server stop
./nfs.client stop
./nfs.server start
./nfs.client start

==========

I would love to know what you receive on this and thanks

==========

We ran into the same problem. Our solution (to avoid rebooting the
clients) was to go to each client having the problem and kill all
processes that had open files on the filesystem experiencing the Stale
NFS file handles and unmount the partition. Next time it was
accessed, it automounted just fine to the new partition.

==========

You can try this procedure:

1. Kill rpc.lockd, then rpc.statd on the client and server.
2. rm -r /var/statmon/sm /var/statmon/sm.bak
3. Start rpc.statd (client 1st, then server).
4. Start rpc.lockd.

Haven't tried this in a while, lemme know if it still works.
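
For what it's worth, the sequence above could be scripted per host along
these lines. This is an untested sketch, not a known-good tool: the daemon
paths (/usr/lib/nfs/lockd, /usr/lib/nfs/statd), the DRY_RUN switch and all
function names are assumptions, and by default it only prints what it
would do. It handles one host at a time, so the client-then-server
ordering in step 3 is still up to you.

```shell
#!/bin/sh
# Sketch only. The paths below are assumptions; adjust for your release.
LOCKD=${LOCKD:-/usr/lib/nfs/lockd}
STATD=${STATD:-/usr/lib/nfs/statd}
STATMON=${STATMON:-/var/statmon}
DRY_RUN=${DRY_RUN:-1}          # 1 = print commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Read `ps -e` output (PID TTY TIME CMD) on stdin and print the PIDs
# whose command name equals $1.
pids_named() {
    awk '$NF == name { print $1 }' name="$1" -
}

# Kill every process called $1, if any are running.
stop_daemon() {
    pids=`ps -e | pids_named "$1"`
    if [ -n "$pids" ]; then
        run kill $pids
    fi
}

reset_lock_daemons() {
    stop_daemon `basename "$LOCKD"`              # step 1: lockd first,
    stop_daemon `basename "$STATD"`              #         then statd
    run rm -r "$STATMON/sm" "$STATMON/sm.bak"    # step 2
    run "$STATD"                                 # step 3
    run "$LOCKD"                                 # step 4
}
```

Call reset_lock_daemons with DRY_RUN=1 first and read the output; set
DRY_RUN=0 only once the printed commands look right for your system.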

==========

Have you tried unmounting the file system?

==========

Rebooting is your only choice. You have already moved the
data so you're gonna be stuck until those clients reboot.

==========

You can umount the stale NFS filesys and then remount.
There is no need to reboot.

==========

Nope... I think you are out of luck.
Stale NFS file handles cannot be cleared without rebooting.

==========

We've had similar stuff and have found that in 90% of cases we can

fuser -c -u /stale/mounted/filesystem
Then kill -9 all PIDs
Then umount /stale/mounted/filesystem (twice sometimes (?))

Should do it for you.
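
The three steps above can be strung together roughly like this. A minimal
sketch, assuming a fuser that writes the bare PIDs to stdout and everything
else (file name, letter codes, user names) to stderr; pids_from_fuser and
clear_stale are hypothetical names, and the two umount attempts simply
mirror the "twice sometimes" remark.

```shell
#!/bin/sh
# Sketch only: kill whatever holds a stale mount open, then unmount it.

# Keep only the purely numeric tokens from fuser's stdout (the PIDs).
pids_from_fuser() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^[0-9][0-9]*$/) print $i }'
}

# clear_stale /stale/mounted/filesystem
clear_stale() {
    fs=$1
    pids=`fuser -c "$fs" 2>/dev/null | pids_from_fuser`
    if [ -n "$pids" ]; then
        kill -9 $pids
    fi
    # umount sometimes needs a second attempt
    for try in 1 2
    do
        umount "$fs" && return 0
        sleep 1
    done
    return 1
}
```

clear_stale returns non-zero if the file system is still mounted after
both attempts, which is the point at which a reboot starts to look likely.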

==========

#!/bin/sh
#
# checkmounts
#
# Clear stale NFS mounts
#
# This script should be run periodically from crontab
#
# Unmount all the stale mounts
for m in `df -k 2>&1 | grep Stale | awk -F: '{print $1}'`
do
     echo "Dismounting stale $m on `hostname`"
     /usr/sbin/umount "$m"
done

# Remount those specified in vfstab
/usr/sbin/mountall -F nfs

# End of script
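
To run checkmounts from cron as its header suggests, an entry along these
lines should do. A sketch only: the install path /usr/local/sbin/checkmounts
and the 15-minute interval are assumptions, not something the poster
specified, and cron_entry is a hypothetical helper name.

```shell
# Hypothetical install location for the checkmounts script.
CHECKMOUNTS=${CHECKMOUNTS:-/usr/local/sbin/checkmounts}

# Print a crontab line that runs the script every 15 minutes.
# The minutes are listed explicitly because not every cron accepts */15.
cron_entry() {
    echo "0,15,30,45 * * * * $CHECKMOUNTS"
}

# To install it into root's crontab:
#   crontab -l > /tmp/cron.$$
#   cron_entry >> /tmp/cron.$$
#   crontab /tmp/cron.$$
```

Since the script echoes each mount it dismounts, cron will normally mail
that output to the crontab owner, which doubles as a log of stale mounts.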

==========

Yeah, we use all automounted home directories (Solaris 2.6 / 2.5.1) and it
appears to work 90% of the time. Unfortunately, the rest of the time the
only way we've found is to reboot. A lot like NT really :-)

==========

Sometimes you can umount them, believe it or not. This doesn't always
work tho.

==========

[My favorite!]

> Aside from rebooting, is there any way to clear
> stale NFS file handles? (SunOS 5.6)

No.

> We have been migrating data from some RSM
> arrays to a new disk subsystem, of which are some
> holders of automounted home directories.

When I do this, I try to:

1) Give advance warning to everyone affected, tell 'em to logout
      and not leave processes running on transition night (or weekend,
      or whatever).

2) Before taking the old file system(s) offline, do a df (or check mnttab)
      on every client on the lan.

2.5) Make a list of every user responsible for an old-file system mount.
      They each owe me a beer. :-)

3) Go in to each offending client with lsof, and in a minimally invasive
      way clear the offending mount(s).

      Worst case, reboot the offending client.

4) *Now* take the old file system(s) offline.
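
Step 2 of the list above lends itself to a small script. A rough sketch,
assuming the usual mnttab layout (resource, mount point, fstype, options,
time) and that the clients allow rsh from the admin host; mounts_from,
check_clients, the RSH variable and the client-list file are all
hypothetical names.

```shell
#!/bin/sh
# Sketch only: find clients that still mount something from the old server.
RSH=${RSH:-rsh}        # assumption: swap in whatever remote shell you use

# mounts_from SERVER [MNTTAB]
# Print the mount point of every NFS mount whose resource is SERVER:/...
mounts_from() {
    awk '$3 == "nfs" && index($1, srv ":") == 1 { print $2 }' \
        srv="$1" "${2:-/etc/mnttab}"
}

# check_clients SERVER CLIENT_LIST_FILE
# CLIENT_LIST_FILE holds one client host name per line.
check_clients() {
    while read host
    do
        $RSH -n "$host" cat /etc/mnttab 2>/dev/null |
            mounts_from "$1" - |
            sed "s|^|$host: |"
    done < "$2"
}
```

Running check_clients with the old server's name and a list of clients
prints one "client: /mount/point" line per leftover mount, which is also
the list of who owes the beer in step 2.5.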

==========

------------------------------------------------------------------------

Thanks go out to:

"McIntire, John" <john_mcintire@unitrode.com>
Asim Zuberi <asim@colltech.com>
Chris Graves <cmg@bow.rsc.raytheon.com>
Colin_Melville@mastercard.com
Dan Stromberg <strombrg@nis.acs.uci.edu>
Dave Plummer <dlp@medphys.ucl.ac.uk>
Jay Lessert <jayl@latticesemi.com>
Mark Hargrave <hargrme@wisdom.maf.nasa.gov>
Mark Kabella <mkabella@Legato.COM>
Michael Cunningham <malice@exit109.com>
Sandeep Patni <spatni@lucent.com>
don@hauraki.demon.co.uk

------------------------------------------------------------------------

My original plea was:

Oh great hive-mind,

Aside from rebooting, is there any way to clear
stale NFS file handles? (SunOS 5.6)

We have been migrating data from some RSM
arrays to a new disk subsystem, of which are some
holders of automounted home directories.

To wit, we're getting a deluge of "No directory! Logging in with home=/"

Any good hacks^wwork-arounds?

Graci'!!

- Ray
........................................................................
Ray Saddler 312.222.4636 - x2150 rsaddler@cccis.com www.cccis.com
Network Systems / UNIX Administration - CCC Information Services, Inc.
World Trade Center Chicago 444 Merchandise Mart Chicago, IL 60654-1005


