Some days ago I posted the following to the sun-managers
list:
>
>Dear System-administrators.
>
>We have a set of workstations, that are interconnected
>by a lot of nfs-mounts. This is set up using the auto-
>mounter, and works fine in general.
>
>Our problems start when one server is down. Then, all
>systems soon become locked because they are all trying
>to auto-mount something on the server that is down (due
>to the way we have set up our system).
>
>I have tried to change some of the parameters of the
>auto.master-map, setting retries to 30, but that did
>not work.
>
>Now, I'm sure that this is a well-known problem. Does
>anyone have the well-known solution?
>
>Best regards to you all:
>
>Morten Krabbe Barfoed
>
Thanks to:
Christian.Masopust@emndev.siemens.co.at (Christian Masopust)
johnh@gerbil.umds.ac.uk (John Hearns)
hkatz@lehman.com (Henry Katz)
drtr@mrao.cam.ac.uk (David Robinson)
pbg@cs.brown.edu (Peter Galvin)
Bruce.Cross@tti.fingerhut.com (Bruce Cross)
raoul@MIT.EDU (Nico Garcia)
ivan@Krynn.nwac.sea06.navy.mil (John Stelling)
fetrow@biostat.washington.edu (Dave Fetrow)
gks@india.ti.com
david@srv.PacBell.COM (David St. Pierre)
Christian Masopust, John Hearns, Henry Katz and Bruce Cross
all suggest changing the mount options, i.e. specifying
soft mounts rather than hard mounts. The Sun manual recommends
soft mounts for filesystems that are only read, e.g. man-page
directories. For filesystems that are written to or executed
from, hard mounts are recommended. The manual does not explain
in detail why; the usual reason is that a soft mount returns an
error when the server times out, and a program that ignores
that error can lose or corrupt data.
David St. Pierre recommended using indirect maps rather than
direct maps. We are using indirect maps.
On hard mounts, the manual further says that one should use
the 'intr' option. This enables the operator to interrupt a
process that hangs waiting for a server that is down.
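As a rough illustration of the options discussed above, here is a sketch of what the maps might look like. The host names (fileserv1, fileserv2), map names and paths are made up for the example and are not our real setup; check automount(8) and mount(8) on your own release before copying anything.

```
# All host names, map names and paths below are hypothetical.

# auto.master (master map): mount-point  map-name  [default mount options]
/home        auto.home      -rw,hard,intr
/share       auto.share     -ro,soft

# auto.home (indirect map): key  [-mount-options]  server:path
# Writable data: hard mount, but interruptible.
morten       -rw,hard,intr  fileserv1:/export/home/morten

# auto.share (indirect map): read-only data can be soft-mounted.
man          -ro,soft       fileserv2:/usr/share/man
```

With entries like these, a dead fileserv2 should only produce errors for people reading man pages, while processes stuck on fileserv1 can at least be interrupted from the keyboard.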
John Hearns (johnh@gerbil.umds.ac.uk) suggests that I buy the book
"Managing NFS and NIS" from O'Reilly, which he finds excellent,
so that piece of advice is passed on to you.
David Robinson writes that this is a well-known problem in
SunOS 4 (which is our OS), and Peter Baar Galvin backs him
up, stating that there is no solution using NFS/automount.
Nico Garcia offers a way to avoid the problem when doing a
controlled halt of a machine. In his words:
"Yes: before bringing the server down, do an "exportfs -auv"
on that server. This updates the state of all the NFS exported
directories and closes off their exporting.
The second trick is to kill -HUP the automount daemon and
restart it. Unfortunately, if you haven't done the exportfs
trick, it'll try to catch up its last modifications to the
old directories before continuing."
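The two steps in Nico Garcia's advice could be wrapped in a small script along these lines. This is only a sketch: the function names and the RUN variable are my own invention, the script prints the commands instead of running them unless told otherwise, and restarting automount with your site's usual arguments is left out on purpose.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Set RUN to the empty string (RUN= ) to really run them, as root.
RUN="${RUN-echo}"

# On the server, just before halting it: unexport everything
# (the "exportfs -auv" from the advice above).
server_pre_halt() {
    $RUN exportfs -auv
}

# On a client: send SIGHUP to the running automount daemon.
# The pid is supplied by the operator (e.g. looked up with ps).
# Restarting automount afterwards is not handled here.
client_hup_automount() {
    $RUN kill -HUP "$1"
}
```

After sourcing the file, "server_pre_halt" on the server and "client_hup_automount <pid>" on each client show what would be run; the same calls with RUN set to the empty string actually run it.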
Dave Fetrow recommends putting effort into setting up the
system correctly: "Holding down crossmounts and having single
points of failure (for example: having /usr/local, and
/usr/spool/mail mounted from the NIS server because if ANY of
those goes you are in trouble) is nice."
gks@india.ti.com recommends using the Berkeley Automounter; I
believe that is the one called amd.
Our plan is the following:
First, we will try to reduce the number of crossmounts that
we are doing, and where possible make them 'soft' (but only when
they are used for reading only).
Second, when bringing down a server, we will use the advice
from Garcia (see above).
Third, I have retrieved amd, and when I have time I'll be looking
into that.
Finally, we are going to switch to Solaris 2 one day (not one of
these days!). The problem does not exist in that OS.
Best regards to you all:
Morten Krabbe Barfoed
Danish Space Research Institute    phone: +45 42 88 22 77 (switch-board)
Gl. Lundtoftevej 7                 phone: +45 45 87 40 77 - 161 (direct)
DK 2800 Lyngby                     FAX:   +45 45 93 02 83
Denmark                            TELEX: 37 198
e-mail: morten@dsri.dk