SUMMARY: Odd msgs from Solaris 2.1 machine

From: Dave Shevett (shevett@sparc.bighouse.unipress.com)
Date: Thu May 13 1993 - 10:54:45 CDT


DING DING DING! We have a winner!

The original post, or a facsimile thereof:

>We had a network problem this weekend that caused a file server
>HD to toss its cookies. In the resulting reconfiguration, two of
>our Solaris 2.1 machines started generating this error ad nauseam,
>and I can't make 'em stop.
>
>May 11 18:18:56 jupiter statd[125]: udp cannot send clnt_create host osf1
>May 11 18:18:56 jupiter statd[125]: udp cannot send due to out of cache
>May 11 18:18:56 jupiter statd[125]: statd: cannot talk to statd at osf1, error -1
>
>osf1 is a - well, you guessed it - an osf/1 machine. Older one, but
>still running OSF1. At one point, I had several drives from the
>osf1 machine mounted on jupiter. They are now out of the vfstab, and
>I don't know where else to look. The message keeps repeating about
>every 20 seconds, and I can't make it stop. Rebooting jupiter doesn't
>help, rebooting osf1 doesn't help. Turning osf1 off and rebooting
>jupiter doesn't help. (btw - this is happening on 2 different machines,
>both running Solaris 2.1.)
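
If the messages are buried in a longer syslog, a few lines of Python can pull
out which host statd is complaining about. This is only a sketch: the regular
expression is an assumption based on the three sample lines quoted above, and
the function name unreachable_hosts is made up for illustration.

```python
import re

# Matches the "cannot talk to statd at <host>" line from the quoted log.
# The pattern is an assumption built from those sample lines only.
STATD_LINE = re.compile(r"statd\[\d+\]: statd: cannot talk to statd at ([^,\s]+)")


def unreachable_hosts(log_lines):
    """Return the distinct hosts statd says it cannot reach, in first-seen order."""
    seen = []
    for line in log_lines:
        match = STATD_LINE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


sample = [
    "May 11 18:18:56 jupiter statd[125]: udp cannot send clnt_create host osf1",
    "May 11 18:18:56 jupiter statd[125]: udp cannot send due to out of cache",
    "May 11 18:18:56 jupiter statd[125]: statd: cannot talk to statd at osf1, error -1",
]
print(unreachable_hosts(sample))
```

Each name it reports is a candidate for a leftover statd monitor file.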

Well, color me purple, but the problem is *identical* to a problem that
shows up on SunOS 4.x machines. A machine goes off line, but there
is still a reference to it... Several people mentioned this:

From: reynolds@icgmfg.mke.ab.com (Michael D. Reynolds)
> Try looking for entries/files in /etc/sm or /etc/sm.bak and removing
> them and rebooting. Here is the section of the man page for sm.
>
> DESCRIPTION
> /etc/sm and /etc/sm.bak are directories generated by
> in.statd. Each entry in /etc/sm represents the name of the
> machine to be monitored by the in.statd daemon. Each entry
> in /etc/sm.bak represents the name of the machine to be
> notified by the in.statd daemon upon its recovery.
>
> /etc/sm.state is a file generated by rpc.statd to record its
> version number. This version number is incremented each
> time a crash or recovery takes place.

Bzzt. Correct definition of problem, but wrong version of the OS. The
sm directory on Solaris 2.1 machines is...

From: Casper Dik <casper@fwi.uva.nl>
> Look in /var/statmon/sm.bak
> There will be a file there named ``osf1''.
> This file tells your machine that osf1 is holding locks and that
> it should ask for them.

The only thing left over is to kill statd and restart it, after finding
it in the oh-so-clever location of /usr/lib/nfs/statd. Once the file
is deleted, and statd restarted, no more messages. *hooray*
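
For anyone who has to do this on more than a couple of machines, the
file-removal half can be sketched in Python. Treat it as a rough illustration
and not a tested admin tool: the function name remove_stale_entries and its
parameters are invented here, and only the /var/statmon/sm.bak location comes
from Casper's reply. It does not touch statd itself, so you still have to kill
the daemon and restart /usr/lib/nfs/statd by hand afterwards.

```python
import os


def remove_stale_entries(host, statmon_dir="/var/statmon", subdirs=("sm.bak",)):
    """Delete the file named after `host` in each subdir; return the paths removed.

    Only sm.bak is checked by default, since that is the directory named in
    the reply above. Passing other subdirectories is the caller's assumption.
    """
    removed = []
    for sub in subdirs:
        path = os.path.join(statmon_dir, sub, host)
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed


# Hypothetical usage, run as root on the affected machine:
#   remove_stale_entries("osf1")
```

Returning the list of removed paths makes it easy to see whether the host
actually had a leftover entry before you bother restarting statd.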

Thanks to those who've answered. Both machines are happy little
campers now, and both users are smiling joyously (okay, one of them
is me...)

Those who responded: (as of 5/13/93 - 11:57am EST)
------------------------------------------------------
stromberg@hydra.acs.uci.edu
reynolds@icgmfg.mke.ab.com (Michael D. Reynolds)
Casper Dik <casper@fwi.uva.nl>
ems@ccrl.nj.nec.com (Ed Strong)

---------------------.--------------------.--------------------------------
Dave Shevett | Unipress Software | "The shortest distance between
Systems Jockey | Edison, NJ USA | two puns is a straight line..."
shevett@unipress.com | (908)287-2100 x921 | - Doc Webster


