SUMMARY: SS10 Locking up

From: Mark Mellman (mellman@fileserver.mpi.com)
Date: Mon Jan 17 1994 - 16:07:06 CST


Original Posting:

|Hi Managers,
|
|I have a SS10 model 512 that is running Solaris 2.2 that I am having some
|problems with (this machine is also my NIS+ server). Every morning when I
|arrive at work, I find the machine locked up. When you press return on the
|keyboard there is no response. You cannot telnet or rlogin to it. NFS hangs.
|STOP A does not even work. You have to physically turn off the machine. When
|the machine comes up from a cold boot, it works fine until the next morning.

As usual, the Sun Managers came through in a big way!

I would like to thank the following people:

Adam Shostack <adam@bwh>

This came through a local mailing list a while back, talking about
Sol2.2 patches. DAS is David Steffens
(eplunix!das@harvard.harvard.edu)

Patch-ID# 101028-01
Keywords: hsfs upgrade hangs
Synopsis: SunOS 5.2: system may hang due to memory leak in hsfs
Date: Jun/04/93

glenn@uniq.com.au (Glenn Satchell - Uniq Professional Services)

Do you have a CD-ROM on this system? Does it have a CD in it at night?
If so then the problem is with the entry in root's crontab that runs
nfsfind. This script also runs over the CD's filesystems and this is
what hangs the system. The fix is to either eject the CD at night, or
change the script so that it will not traverse the /cdrom filesystem.
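
To illustrate the second option (not traversing /cdrom), here is a minimal
sketch in Python. It is not the nfsfind script itself, which is a shell
script whose contents are not shown here. The function name walk_skipping
and its excluded parameter are hypothetical, chosen only for this example.
The idea is simply to prune the excluded directory before the walk descends
into it.

```python
import os


def walk_skipping(root, excluded=("/cdrom",)):
    """Yield file paths under root without descending into excluded trees.

    Hypothetical helper for illustration only; the default "/cdrom" is the
    mount point named in the discussion above.
    """
    excluded = {os.path.abspath(p) for p in excluded}

    # Nothing to do if the starting point itself is excluded.
    if os.path.abspath(root) in excluded:
        return

    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        # Prune in place: with topdown=True, os.walk only descends into
        # the directories that remain in dirnames.
        dirnames[:] = [
            d for d in dirnames
            if os.path.abspath(os.path.join(dirpath, d)) not in excluded
        ]
        for name in filenames:
            yield os.path.join(dirpath, name)


# Example use (left as a comment so nothing runs on import):
#   for path in walk_skipping("/", excluded=("/cdrom",)):
#       print(path)
```

Pruning the directory list, rather than filtering results afterwards, is
what matters here: the walk never reads the CD's filesystem at all.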

  Although the documentation doesn't say so, this patch allows you to
keep the AnswerBook CD-ROM (or any other hsfs-format CD-ROM with many
symbolic links) online and mounted overnight. Without this patch, my
system wedged every night at 3AM when cron ran find down the CD-ROM
directory tree. I'm also sending a note to Sun suggesting that they
add a comment about this behavior to the patch documentation. -- DAS

Adam

mp@allegra.att.com (Mark Plotnick)

I had this problem a few weeks ago with an SS10-51 and an SS10-512,
and asked sun-managers.
I've been delinquent in summarizing. What worked was to apply
patch 101408-01, "SS10-51 or SS600-51 may hard hang or watchdog reset"
for 4.1.3, which is basically "apply patch 100725-12 and patch the
kernel to set enable_sm_wa to 1".

Why am I telling you this, since you said you run Solaris 2.2?
Well, patch 101406-01, "SS10 Model 514 reliability problems" for Solaris 2.3,
is basically "apply patch 101318-12 and edit /etc/system to
set enable_sm_wa to 1". So it seems like any model 51
should have these patches applied.


