On Monday I wrote:
>A Sparc 1+ attempts to boot, but part way through the loading
>of vmunix (the spinning wheel appears, and the 'NNNNNN+XXXXX+
>YYYYY' shows), it crashes with a "watchdog reset: instruction
>access exception". There is a clearly visible "blip" to the
>workstation monitor just as this happens.
>This is after the memory self-test has (apparently)
>passed. However, after the failed boot, running 'test-memory'
>from the boot prom results in another watchdog reset and
>the error "memory address not aligned." There is, however,
>no reference to any slot number to indicate which SIMM might
>be bad.
>Powering off the system and restarting it, I can interrupt the
>startup self-test with L1-A and then run 'test-memory'
>from the boot prom, with no errors shown. Attempting
>to boot results in the same failure as above, after which
>'test-memory' again reports the alignment error.
>There are two questions: (1) is the boot error shown above
>an indication of a memory problem or something else? and
>(2) if this is an indication of a bad SIMM, how to identify
>it?
>And, of course, the real question is what to do now? Thanks.
First, thanks to all who responded (and to any others whose
responses come in after I send this):
firstname.lastname@example.org (Adam Shostack)
Dale Houston <email@example.com>
trdlnk!mike@uunet.UU.NET (Michael Sullivan)
Ted Walsky <firstname.lastname@example.org>
email@example.com (Kevin Heagney)
Gary Marazita (Melbourne Aust. Eng. ) <GAM.GARY@MELPN1.Prime.COM>
Several people suggested that this might not be a memory problem at
all, but rather a corrupted kernel. While this turned out not to be
the case here, the suggestion to boot another kernel (some
suggested the miniroot from CD-ROM or tape) is a good one, since it
can rule out the kernel as the problem.
Adam provided a good description of a systematic means of swapping
and testing SIMMs that is worth repeating:
>Binary swapping refers to yanking out 1/2 of the suspect
>components, and moving them. If the problem moves, move 1/2 the moved
>components back. This allows you to quickly narrow down the source of
>the problem.
In fact, I managed to find the bad SIMM using this method.
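For anyone who wants to see why this converges so quickly, here is a
minimal sketch of the halving idea in Python. It is only an
illustration, not a procedure from Sun or from Adam: the names
find_bad and fails_with are made up, the SIMM labels are hypothetical,
and it assumes exactly one bad part whose fault reliably follows it
when moved.

```python
def find_bad(suspects, fails_with):
    """Narrow a list of suspects down to one by repeated halving.

    fails_with(group) is a hypothetical check standing in for
    "move this group and see whether the failure follows it".
    Assumes exactly one faulty component.
    """
    suspects = list(suspects)
    steps = 0
    while len(suspects) > 1:
        mid = len(suspects) // 2
        half, rest = suspects[:mid], suspects[mid:]
        steps += 1
        # Keep whichever half the fault follows.
        suspects = half if fails_with(half) else rest
    return suspects[0], steps


# Simulated run: 16 SIMMs, one of them (chosen arbitrarily) is bad.
simms = [f"SIMM-{i}" for i in range(16)]
bad = "SIMM-11"
found, steps = find_bad(simms, lambda group: bad in group)
print(found, steps)  # SIMM-11 4
```

With 16 suspects the bad part is isolated in 4 swap-and-test rounds
rather than up to 15 one-at-a-time swaps.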
With respect to the confusing behavior of the system memory
test, Michael noted:
>My understanding (based on a call to Sun Tech Support several years
>ago) is that the SPARCstation 1's memory test doesn't work properly
>after a boot has been attempted; it has something to do with it
>expecting the caching and/or memory management unit to be in the
>initial state, rather than the configuration which the boot program and
>SunOS put it.
Kevin passed along a list of patch IDs related to watchdog
resets. None of them is relevant to my situation, but they might be
to someone else's:
100232-01 SunOS 4.1.1: Sparcstation 2 crashes or watchdog resets
100017-01 Breakpoints in kadb cause watchdog resets
100319-04 SunOS 4.1.1 Watchdog Reset in Sun4-490 FDDI->Ethernet router
Dale suggested a video problem. Given the video 'blip' I reported
in my original post, I pursued this by swapping both the frame
buffer and the monitor. Neither helped here, but it was something
else to try.
--
Tim Evans                     | E.I. du Pont de Nemours & Co.
firstname.lastname@example.org | Experimental Station
(302) 695-9353/7395           | P.O. Box 80357
EVANSTK AT A1 AT ESVAX        | Wilmington, Delaware 19880-0357