Summary : a system error (BENDRISS Elmehdi)

From: BENDRISS Elmehdi <e.bendriss_at_menara.ma>
Date: Thu Jun 26 2003 - 06:57:50 EDT
Thanks to Justin Stringfellow, who gave me a full explanation of the
problem. Here is his mail:

Two warnings are raised.
CPU0 comes first: it fails while writing data back to cache, because the
parity check fails on the data being written back. This is known as a
writeback parity or "WP" event:
Jun 25 23:13:29 ms1 unix: WARNING: [AFT1] WP event on CPU0, errID 0x000a52a2.24f3553f
Jun 25 23:13:29 ms1 unix: AFSR 0x00000000.00800004<WP> AFAR 0x000001fe.01800f00
Jun 25 23:13:29 ms1 unix: AFSR.PSYND 0x0004(Score 95) AFSR.ETS 0x00 Fault_PC 0x100662c0
Jun 25 23:13:29 ms1 unix: UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND 0x00
CPU2 complains too; it owns the cache that is being written to, and
notices that someone is trying to write crap into its cache:
Jun 25 23:13:38 ms1 unix: WARNING: [AFT1] Uncorrectable Memory Error on CPU2 Data access at TL=0, errID 0x000a52a4.417a572b
Jun 25 23:13:38 ms1 unix: AFSR 0x00000000.80200000<PRIV,UE> AFAR 0x00000000.fc0662e8
Jun 25 23:13:38 ms1 unix: AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10020f98
Jun 25 23:13:38 ms1 unix: UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0203<UE> UDBL.ESYND 0x03
Jun 25 23:13:38 ms1 unix: UDBL Syndrome 0x3 Memory Module U1404 U0404 U1403 U0403
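
As a rough illustration of how the AFSR values in the two excerpts map
to the flags printed in angle brackets, here is a minimal Python sketch.
The bit positions (PRIV = bit 31, WP = bit 23, UE = bit 21, PSYND = low
16 bits) are an assumption that is consistent with these two log lines;
they are not taken from the mail and should be checked against the CPU
manual before being relied on. The function name decode_afsr is
hypothetical.

```python
# ASSUMPTION: bit positions consistent with the two log excerpts above,
# not confirmed by the original mail. Only the three flags seen there are listed.
AFSR_FLAGS = {31: "PRIV", 23: "WP", 21: "UE"}


def decode_afsr(text):
    """Decode an AFSR value printed as '0xHHHHHHHH.LLLLLLLL'.

    Returns (list of flag names, parity syndrome from the low 16 bits).
    """
    # The kernel prints the 64-bit register as two 32-bit halves joined by '.'
    value = int(text.replace("0x", "").replace(".", ""), 16)
    flags = [name
             for bit, name in sorted(AFSR_FLAGS.items(), reverse=True)
             if (value >> bit) & 1]
    psynd = value & 0xFFFF
    return flags, psynd


cpu0 = decode_afsr("0x00000000.00800004")
cpu2 = decode_afsr("0x00000000.80200000")
print(cpu0)  # (['WP'], 4)
print(cpu2)  # (['PRIV', 'UE'], 0)
```

The decoded values match what the kernel already printed (<WP> with
PSYND 0x0004 for CPU0, <PRIV,UE> with PSYND 0x0000 for CPU2), so the
sketch is only useful for checking a raw register value by hand.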
The key piece of information is the "score". Under CPU0 we see
"Score 95". This is the kernel decoding the asynchronous fault
address register ("AFAR") and asynchronous fault status register
("AFSR") bits and deciding who is the culprit and who is the
victim. It gives a higher score (from 0-100) to a more likely culprit;
here CPU0 scores 95 and CPU2 only 05.
Without this knowledge it would be easy to think that CPU2 had a
problem as well, but it doesn't.
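
The comparison of scores described above can be sketched in a few lines
of Python. This is only an illustration that assumes the log lines have
been unwrapped to one message per line and that each "(Score NN)" line
follows the WARNING line naming its CPU, as in the excerpts here. The
function name scores_by_cpu is hypothetical.

```python
import re

# Two lines per event, copied from the messages quoted in this mail.
LOG = """\
Jun 25 23:13:29 ms1 unix: WARNING: [AFT1] WP event on CPU0, errID 0x000a52a2.24f3553f
Jun 25 23:13:29 ms1 unix: AFSR.PSYND 0x0004(Score 95) AFSR.ETS 0x00 Fault_PC 0x100662c0
Jun 25 23:13:38 ms1 unix: WARNING: [AFT1] Uncorrectable Memory Error on CPU2 Data access at TL=0, errID 0x000a52a4.417a572b
Jun 25 23:13:38 ms1 unix: AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x10020f98
"""


def scores_by_cpu(log_text):
    """Map each CPU number to the highest score logged for it."""
    scores = {}
    cpu = None
    for line in log_text.splitlines():
        # A WARNING line starts a new event and names the reporting CPU.
        m = re.search(r"WARNING: \[AFT\d\].*?\bCPU(\d+)", line)
        if m:
            cpu = int(m.group(1))
            continue
        # The score appears on a following line of the same event.
        m = re.search(r"\(Score (\d+)\)", line)
        if m and cpu is not None:
            scores[cpu] = max(scores.get(cpu, 0), int(m.group(1)))
    return scores


scores = scores_by_cpu(LOG)
culprit = max(scores, key=scores.get)
print(scores)   # {0: 95, 2: 5}
print(culprit)  # 0
```

With the messages from this mail, CPU0 comes out as the likely culprit
and CPU2 as the victim, which matches the explanation above.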


Elmehdi BENDRISS
System Administrator                           email  :   e.bendriss@menara.ma
Direction Internet                             Tel    :   +212 37 71 88 73
Maroc Telecom                                  GSM    :   +212 61 47 47 09
Received on Thu Jun 26 07:00:48 2003