Summary: what might be the reason?

From: judyh@nikko.com
Date: Fri Aug 15 1997 - 14:08:09 CDT


My original post was:

> I have a Sparc-20 running Solaris 2.5.1; it was running fine.
> This morning when I came in, it was unreachable and the console was black,
> but the lights on the box and monitor were on. I could not ping it, and I
> could not use STOP-A. The only thing I could do was reboot.
>
> What might be the problem?

Thanks to the following experts:

        Mark Hargrave hargrme@wisdom.maf.nasa.gov
        "D. Stew McLeod" <stewart.mcleod@boeing.com>
        Scott McDermott <scottm@kcls.org>
        "Rick von Richter" <rickv@mwh.com>
        Hans Schaechl <hans@mpim-bonn.mpg.de>
        Timothy L Patton <tlpatton@uncc.edu>
        Jason Marshall <jasonm@vsl.com>
        gibian@stars1.hanscom.af.mil (Marc S. Gibian)
        Ray Brownrigg <ray@isor.vuw.ac.nz>
        

The answers are:

* Remember that just because the screen is blank after STOP-A does not
necessarily mean that the STOP-A has not worked. I often find that
typing "boot" blind at the console after a STOP-A will actually work,
preserving the logging records to help tell what went wrong.

* I see exactly the symptoms you describe, rarely and at random, on
SPARCstation 20s running Solaris 2.5 and 2.5.1. I have no clue what is wrong,
and the only solution is a power cycle, since there is no other way to get the
system to respond. It also leaves behind no clues as to what happened, so a
very tight low-level lockup would seem to be the only explanation.

* Swap out the memory. Any time I get this problem, it's always
memory-related.

* First, make sure you have crash dumps enabled; see /etc/rc2.d/S20sysetup.
Then, the next time this happens, unplug the keyboard for a few seconds and
plug it back in. Hopefully the screen will come back and you will be at the ok
prompt. Type 'sync' to force a dump of memory; you can then analyze the dump
file and see what went wrong. Even if the screen doesn't come back, you should
hear a beep when you plug the keyboard back in, and you still need to type
'sync' in that case.

* I'd guess that someone disconnected the keyboard and plugged it back in
(which will make the machine beep and drop to a STOP-A prompt). I had
that happen to me once, and didn't think to try just typing 'go', because
when I hit STOP-A, I got no response.

* I had the same problem this week on a SPARC 20, and the FE (field engineer)
found a bad graphics card.
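
A note on the crash-dump suggestion above: as far as I know, the savecore
lines in /etc/rc2.d/S20sysetup ship commented out on Solaris 2.5.1, so
"enabling crash dumps" amounts to uncommenting them. The sketch below shows
roughly what that stanza does once it is active. The function name
save_crash_dump and the SAVECORE variable are illustrative only and are not in
the stock script, the savecore path is an assumption, and the exact wording in
your copy of the file may differ.

```shell
#!/bin/sh
# Sketch only: mirrors the savecore stanza that ships commented out in
# /etc/rc2.d/S20sysetup. save_crash_dump and SAVECORE are illustrative
# names, not part of Solaris; the savecore path is an assumption.
SAVECORE=${SAVECORE:-/usr/bin/savecore}

save_crash_dump() {
    crash_dir=$1

    # Create the crash directory on first use, accessible by root only.
    if [ ! -d "$crash_dir" ]; then
        mkdir -m 0700 -p "$crash_dir" || return 1
    fi

    # Copy any dump left on the dump device into the crash directory.
    if [ -x "$SAVECORE" ]; then
        echo 'checking for crash dump...'
        "$SAVECORE" "$crash_dir"
    else
        echo "savecore not found at $SAVECORE; nothing saved" >&2
    fi
}

# On the Sparc-20 itself this would be called at boot as:
#   save_crash_dump /var/crash/`uname -n`
```

With this in place, a 'sync' typed at the ok prompt after a hang should leave
a dump that is picked up into the crash directory on the next boot.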


