Managers,
A late summary. Thanks very much to all those who took the time to answer.
Only one answer, from Hal Stern, was actually actionable.
I have not done anything, and the system has stayed up since - touch wood.
I have included all answers since some are amusing.
>- Oran
>Lately one of my users' IPCs panicked. It then rebooted and has now been up for two days.
>Any pointers to the nature of the problem (patch?) appreciated.
>
> >- Oran
>.....
>DVMA Parity Error, ctx = 0x0, virt addr = 0xfff14bc0
>pme = e2000973, phys addr = 973bc0
>Parity Error Register 91<ERROR,CHECK,ERR24>
> bad module/chip at: ?
>System operation cannot continue, will test location anyway.
>parity error at 973bc0 is transient.
>panic: dvma parity error
>esp0: Unrecoverable DMA error on dma send
>sd0: SCSI transport failed: reason 'tran_err': retrying command
>syncing file systems... esp0: Target 3 now Synchronous at 4.167 mb/s max transmit rate
>......
--------------------------------------------------
From: celeste@stokely.mtview.ca.us (Celeste Stokely)
To: spg.amdahl.com!oran@netcom.com
Subject: Re: Whay did we panic?
You panic'd because of a random memory parity error. If it only happens
once, chalk it up to cosmic rays. If it continues, you may have a simm that
has worked loose, or a bad simm.
..Celeste Stokely
--------------------------------------------------
From: ron@arfalas.horizon.com (Ron McDaniels)
To: oran@spg.amdahl.com
Subject: Re: Whay did we panic?
Cc: brummp@spg.amdahl.com
Very familiar. I installed a SIMM that had a flakey chip. Memory parity
errors are recoverable if detected by the CPU; fatal if during I/O dma.
My problems went away after I found and replaced the bad SIMM (found it
on a Single board computer my company makes. We make a decent diagnostic).
Ron McDaniels
--------------------------------------------------
From: admin%esrg@hub.ucsb.edu (system administrator)
To: oran@spg.amdahl.com
Subject: Re: Whay did we panic?
If you find out why, could you mail me?
I think mine did the same thing....
following from /var/adm/messages
:
--------------------------------------------------
From: hobbit@ftp.com (*Hobbit* )
Reply-To: hobbit@ftp.com
Sender: hobbit-e@ftp.com
Repository: babyoil.ftp.com
Originating-Client: cukes
Gamma rays from the planet Mongo
_H*
--------------------------------------------------
From: stern@sunne.East (Hal Stern - NE Area Systems Engineer)
To: oran@spg.amdahl.com
Subject: Re: Whay did we panic?
bug in the esp driver.
get patch 100343-01, which fixes this and gets you
1.3Gbyte disk features.
--hal
--------------------------------------------------
From: Brendan Kehoe <brendan@cs.widener.edu>
To: oran@spg.amdahl.com
Subject: Re: Whay did we panic?
Newsgroups: widener.mail.sun-managers
In-Reply-To: <ko02fhINN7vm@cs.widener.edu>
Organization: Widener University Computer Science Dept, Chester PA
Cc:
The parity error message means one of your memory chips (probably,
possibly a different one given the `at chip: ?' msg) has gone bad.
Give your hardware support a call.
--------------------------------------------------
From: kevin@toad.com
No patch involved as far as I can see - that was a parity error while
doing DMA for the disk. That's hardware mate :-)
l & h,
kev
--------------------------------------------------
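Several of the replies above hinge on whether the parity error recurs, and whether it recurs at the same address. A minimal sketch of one way to check this is to scan saved console or /var/adm/messages lines for the "parity error at <hex>" report and count occurrences per physical address. The function name count_parity_errors and the sample list are hypothetical, and the pattern assumes the message format shown in the quoted panic output; other releases may word the message differently.

```python
import re
from collections import Counter

# Assumed message format, taken from the quoted panic output:
#   "parity error at 973bc0 is transient."
PARITY_RE = re.compile(r"parity error at ([0-9a-fA-F]+)")


def count_parity_errors(lines):
    """Return a Counter mapping physical address (int) -> number of reports."""
    counts = Counter()
    for line in lines:
        match = PARITY_RE.search(line)
        if match:
            counts[int(match.group(1), 16)] += 1
    return counts


# Hypothetical sample: a few lines copied from the panic output in this thread.
sample = [
    "DVMA Parity Error, ctx = 0x0, virt addr = 0xfff14bc0",
    "pme = e2000973, phys addr = 973bc0",
    "parity error at 973bc0 is transient.",
    "panic: dvma parity error",
]

for addr, hits in count_parity_errors(sample).items():
    print(f"{addr:#x}: {hits}")
```

A single hit would fit the one-off explanation; repeated hits at the same or nearby addresses would point more toward a loose or bad SIMM, as suggested in the replies.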
This archive was generated by hypermail 2.1.2 : Fri Sep 28 2001 - 23:06:36 CDT