SUMMARY: E420R Reboot - Uncorrectable Memory Error

From: Tan Kian Chye <kianchye.tan_at_xatmi.com>
Date: Wed Dec 17 2003 - 22:55:42 EST
Hi,

The cause of the problem was CPU3: it raised an error when its ecache data
was being copied out to CPU0.
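
As a rough illustration of how that conclusion can be read from the log in
the original message, the sketch below scans messages text for the [AFT1]
"CP event" warning and reports which CPU supplied the ecache line and which
CPU took the error. The function name, the dictionary keys and the regular
expression are my own illustration, not part of any Sun tool, and the pattern
is only modelled on the single warning line quoted in this thread.

```python
import re

# Modelled on the one "[AFT1] CP event" warning quoted in this thread;
# other Solaris releases may word the message differently (assumption).
CP_EVENT = re.compile(
    r"CP event on CPU(\d+) \(caused (.+?) on CPU(\d+)\), "
    r"errID\s+(0x[0-9a-f]+\.[0-9a-f]+)"
)


def find_cp_events(log_text):
    """Return one dict per CP event found in the given log text."""
    # The mail client wrapped the log lines, so collapse all whitespace
    # into single spaces before matching.
    flat = " ".join(log_text.split())
    return [
        {
            "source_cpu": int(m.group(1)),   # CPU whose ecache line was copied
            "effect": m.group(2),            # what the receiving CPU saw
            "victim_cpu": int(m.group(3)),   # CPU that took the error
            "err_id": m.group(4),
        }
        for m in CP_EVENT.finditer(flat)
    ]


sample = """Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 339554 kern.warning]
WARNING: [AFT1] CP event on CPU3 (caused Data access error on CPU0), errID
0x00111411.65de423d"""

for event in find_cp_events(sample):
    print(event)
```

Matching on the errID as well makes it possible to tie the CP event back to
the Uncorrectable Memory Error warning that carries the same errID.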

Another problem highlighted was that the OS patches need to be brought up
to release -27.

Many thanks to the following, who replied:

	Rich Bonfoey
	Willie Flint
	Jay Lessert
	Kevin Raber
	Murray Robert
	Place, Richard
	Amiel Lee Yee
	Joohyun Cha


Regards,
Tan Kian Chye


-----Original Message-----
From: Tan Kian Chye [mailto:kianchye.tan@xatmi.com]
Sent: Wednesday, December 17, 2003 10:10 AM
To: 'sunmanagers@sunmanagers.org'
Subject: E420R Reboot - Uncorrectable Memory Error


Hi,
One of our systems crashed/rebooted today and I just wanted to make sure I
am heading in the right direction. This is the first time this has happened
in our environment.

Some suggestions on the web mentioned that this could be a problem with the
CPU, but I just wanted to post the info here to see if anyone had any
additional information on what I can do to track down or perhaps fix this
problem.

Is this a CPU or a memory problem? Maybe both?


Below is the /var/adm/messages:
---------------------------------------------------------------------------
Dec 16 01:58:33 xa-ora-fin SUNW,UltraSPARC-II: [ID 424925 kern.warning]
WARNING: [AFT1] Uncorrectable Memory Error on CPU0 Data access at TL=0,
errID 0x00111411.65de423d
Dec 16 01:58:33 xa-ora-fin AFSR 0x00000000.80200000<PRIV,UE> AFAR
0x00000000.9ae64db8
Dec 16 01:58:33 xa-ora-fin AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00
Fault_PC 0x100cb160
Dec 16 01:58:33 xa-ora-fin UDBH 0x00b2 UDBH.ESYND 0xb2 UDBL 0x0203<UE>
UDBL.ESYND 0x03
Dec 16 01:58:33 xa-ora-fin UDBL Syndrome 0x3 Memory Module U1304 U0304 U1303
U0303
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 954695 kern.warning]
WARNING: [AFT1] errID 0x00111411.65de423d Syndrome x3 indicates that this
may not be a memory module problem
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 999597 kern.info] [AFT2]
errID 0x00111411.65de423d PA=0x00000000.9ae64db8
Dec 16 01:58:34 xa-ora-fin E$tag 0x00000000.0a40135c E$State: Shared
E$parity 0x05
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x00): 0x00000000.00000000
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x08): 0x00000000.00000021
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x10): 0x00000000.00000000
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x18): 0x00000000.00000000
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x20): 0x00000000.00000000
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x28): 0x00000000.0017c527
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x30): 0x00000000.0017cb5f
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2]
E$Data (0x38): 0x00000000.0000066a *Bad* PSYND=0x00ff
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 339554 kern.warning]
WARNING: [AFT1] CP event on CPU3 (caused Data access error on CPU0), errID
0x00111411.65de423d
Dec 16 01:58:34 xa-ora-fin AFSR 0x00000000.01000008<CP> AFAR
0x00000000.9ae64db8
Dec 16 01:58:34 xa-ora-fin AFSR.PSYND 0x0008(Score 95) AFSR.ETS 0x00
Dec 16 01:58:34 xa-ora-fin UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000
UDBL.ESYND 0x00
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 999597 kern.info] [AFT2]
errID 0x00111411.65de423d PA=0x00000000.9ae64db8
Dec 16 01:58:34 xa-ora-fin E$tag 0x00000000.1b40135c E$State: Owner E$parity
0x0d
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x00): 0x00000000.00000000
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x08): 0x00000000.00000021
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x10): 0x00000000.00000000
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x18): 0x00000000.00000000
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x20): 0x00000000.00000000
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x28): 0x00000000.0017c527
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2]
E$Data (0x30): 0x00000000.0017cb5f
Dec 16 01:58:34 xa-ora-fin SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2]
E$Data (0x38): 0x00000000.0000066a *Bad* PSYND=0x0008
Dec 16 01:58:34 xa-ora-fin unix: [ID 836849 kern.notice]
Dec 16 01:58:34 xa-ora-fin ^Mpanic[cpu0]/thread=30004623000:
Dec 16 01:58:34 xa-ora-fin unix: [ID 787147 kern.notice] [AFT1] errID
0x00111411.65de423d UE Error(s)
Dec 16 01:58:34 xa-ora-fin See previous message(s) for details
/scsi failed
Dec 16 13:23:01 xa-ora-fin unix: [ID 389951 kern.info] mem = 4194304K
(0x100000000)
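
For anyone reading the AFSR values above by hand, here is a minimal sketch of
how the flag names in angle brackets relate to the hex value. The bit
positions for PRIV (31), CP (24) and UE (21) are inferred from the two AFSR
lines in this log, which they reproduce. The ETS field at bits 19:16 and
PSYND in the low 16 bits are my assumption about the UltraSPARC-II layout,
consistent with the values printed here but not confirmed by them alone. The
function name is illustrative, and other AFSR flags are deliberately left out.

```python
# Only the flags that appear in this log are listed; the real register
# has more bits, which this sketch ignores.
AFSR_FLAGS = {31: "PRIV", 24: "CP", 21: "UE"}


def decode_afsr(text):
    """Decode an AFSR value printed as '0xHHHHHHHH.LLLLLLLL'."""
    high, low = text.split(".")
    value = (int(high, 16) << 32) | int(low, 16)
    flags = [
        name
        for bit, name in sorted(AFSR_FLAGS.items(), reverse=True)
        if (value >> bit) & 1
    ]
    return {
        "flags": flags,
        "ets": (value >> 16) & 0xF,   # assumed field position (bits 19:16)
        "psynd": value & 0xFFFF,      # assumed field position (bits 15:0)
    }


# The two AFSR values from the log: first CPU0, then CPU3.
for afsr in ("0x00000000.80200000", "0x00000000.01000008"):
    print(afsr, decode_afsr(afsr))
```

Decoded this way, the CPU0 record carries the UE (uncorrectable error) flag
while the CPU3 record carries only CP with a non-zero PSYND, which matches
the flags and syndromes the kernel printed next to each value.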
Received on Wed Dec 17 22:55:30 2003
