One of our Sparc 4/65s crashed today with the following errors:
DVMA Parity Error, ctx = 0x0, virt addr = 0xff0782d0
pme = e3002372, phys addr = 23722d0
Parity Error Register 94<ERROR,CHECK,ERR08>
bad module/chip at: U683
System operation cannot continue, will test location anyway.
parity error at 23722d0 is transient.
panic: dvma parity error
esp0: Unrecoverable DMA error on dma send
sd0: SCSI transport failed: reason 'tran_err': retrying command
The system then rebooted itself normally.
Does anyone have any idea what would cause this? Should I write it off
as a "glitch" or is it a sign of potential impending disaster?
In the tradition of this great list, I received an overwhelming number
of responses. In summary, most people agreed that it was related to a
potentially bad SIMM at slot U683. Some recommended that the chip be
replaced, since it is likely that more problems will occur. Others
suggested that it may not be a permanent problem, and that I should adopt
a wait-and-see attitude, and if it happens again, replace the SIMM.
Some said that opening up the unit and making sure the SIMMs are well
seated might be a good idea.
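As a side note, the physical address in the panic message can be reproduced from the pme and the virtual address, which suggests the report is internally consistent. The sketch below assumes 4 KB pages and that the low 16 bits of the pme hold the physical page number. Neither is stated in the messages themselves, and the names PAGE_SHIFT, PFN_MASK and phys_from_pme are made up for illustration.

```python
# Assumptions (not stated in the log): 4 KB pages, and the low 16 bits
# of the page map entry (pme) are the physical page frame number.
PAGE_SHIFT = 12
PFN_MASK = 0xFFFF


def phys_from_pme(pme, virt_addr):
    """Combine the page frame from the pme with the in-page offset."""
    page_frame = pme & PFN_MASK
    offset = virt_addr & ((1 << PAGE_SHIFT) - 1)
    return (page_frame << PAGE_SHIFT) | offset


# Values copied from the panic message above.
print(hex(phys_from_pme(0xE3002372, 0xFF0782D0)))  # 0x23722d0
```

Under those assumptions the result matches the "phys addr = 23722d0" line, so the failing location is the one the kernel mapped to U683.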
Many thanks to all who responded:
email@example.com (Birger A. Wathne)
Steve Elliott <firstname.lastname@example.org>
Tim Beyea <beyea@ERC.MsState.Edu>
email@example.com (Jeff Aldrich)
firstname.lastname@example.org (Celeste Stokely)
email@example.com (Mike Pearlman)
firstname.lastname@example.org (Daniel Trinkle)
email@example.com (walt klingenberg)
firstname.lastname@example.org (Frank 'Scruff' Miller)
Mike Raffety <email@example.com>
Patrick Shopbell <firstname.lastname@example.org>
Robert Haddick <email@example.com>
evan@flatiron (Evan L. Marcus)
ups!kevin@fourx.Aus.Sun.COM (Kevin Sheehan)
-- Dave Rubin Polytechnic University firstname.lastname@example.org