SUMMARY: E250, Solaris 8 - Panics regularly!

From: <jhartzen_at_csc.com>
Date: Thu Dec 13 2001 - 01:59:43 EST
Thanks to Robert, Roman, Mike, Santha, Tony, and Willief. I had the
"dreaded ecache fault". Sun replaced the CPU because the machine is less
than a year old.

Tony Walsh gets the bonus points for explaining what to look for in the
messages.

Note: Sun came to the same conclusion only a couple of hours after I sent
them my Explorer output, which showed that I was already on a current kernel
patch level.

Johan

You need to get the machine serviced by Sun. The CPU in slot 0 is suffering
the "dreaded Ecache" fault and needs replacing. The "[AFT1] EDP event on CPU0
Data access at TL=0," text is the giveaway, and the fault that the CPU is
having is described by the following extract from your messages:
"E$State: Modified E$parity 0x05 Badlines found=3". This means that multiple
parity errors are being detected in the CPU cache, which makes the data
unreliable.

Regards Tony Walsh
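
For anyone who wants to check their own messages file for the same signature, here is a rough Python sketch of what Tony describes. The function name and regular expressions are my own, built only from the log text quoted in this thread; other kernel or patch levels may word these messages differently, so treat it as a starting point rather than a diagnostic tool.

```python
import re
from collections import Counter

# Patterns are taken from the log text quoted in this thread (assumption:
# other kernel versions may format these messages differently).
EDP_EVENT = re.compile(r"\[AFT1\] EDP event on CPU(\d+)")
ECACHE_PARITY = re.compile(
    r"E\$parity (0x[0-9a-fA-F]+)(?: Badlines found=(\d+))?"
)


def scan_messages(text):
    """Count [AFT1] EDP events per CPU and collect E$parity reports."""
    # Mail clients wrap long syslog lines, so collapse all whitespace
    # into single spaces before matching.
    flat = " ".join(text.split())
    events = Counter(int(m.group(1)) for m in EDP_EVENT.finditer(flat))
    parity = [
        (m.group(1), int(m.group(2)) if m.group(2) else None)
        for m in ECACHE_PARITY.finditer(flat)
    ]
    return events, parity


if __name__ == "__main__":
    sample = """WARNING: [AFT1] EDP event on CPU0 Data access at TL=0, errID
0x00010685.8b0d7b7a
E$tag 0x00000000.0bc007e8 E$State: Modified
E$parity 0x05 Badlines found=3
"""
    events, parity = scan_messages(sample)
    for cpu, count in sorted(events.items()):
        print("CPU%d: %d EDP event(s)" % (cpu, count))
    for value, badlines in parity:
        print("E$parity %s, bad lines reported: %s" % (value, badlines))
```

On a live system the input would normally be the contents of /var/adm/messages. Repeated EDP events against the same CPU are the pattern Tony says to look for.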








                                                                                                                                  
From:    Johan Hartzenberg/GIS/CSC@CSC
Sent by: sunmanagers-admin
Date:    11/12/2001 11:43 AM
To:      sunmanagers@sunmanagers.org
cc:
Subject: E250, Solaris 8 - Panics regularly!




I've got a machine which is panicking fairly regularly. Fortunately, it is
not a production machine. I cannot yet establish any pattern as to when
it panics.

If you have any thoughts as to what went wrong and what I need to do to make
it feel better, please let me know!

Here are some messages file extracts from the last few panics.

Bonus points go for explanations of the panic info!

Thanx in advance,
  _Johan



Dec  5 14:46:57 zactcsut003 SUNW,UltraSPARC-II: [ID 427843 kern.warning]
WARNING: [AFT1] EDP event on CPU0 Data access at TL=0, errID
0x00017050.db2e4aff
Dec  5 14:46:57 zactcsut003     AFSR 0x00000000.80400040<PRIV,EDP> AFAR
0x00000000.3424cb98
Dec  5 14:46:57 zactcsut003     AFSR.PSYND 0x0040(Score 95) AFSR.ETS 0x00
Fault_PC 0x1024d45c
Dec  5 14:46:57 zactcsut003     UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000
UDBL.ESYND 0x00
Dec  5 14:46:58 zactcsut003 SUNW,UltraSPARC-II: [ID 543262 kern.info]
[AFT2] errID 0x00017050.db2e4aff PA=0x00000000.3424cb98
Dec  5 14:46:58 zactcsut003     E$tag 0x00000000.0fc00684 E$State: Modified
E$parity 0x07
Dec  5 14:46:58 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x00): 0x00000300.0200d608
Dec  5 14:46:58 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x08): 0x00000300.0204fd20
Dec  5 14:46:58 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x10): 0x00000075.00000000
Dec  5 14:46:58 zactcsut003 SUNW,UltraSPARC-II: [ID 989652 kern.info]
[AFT2] E$Data (0x18): 0x00800000.00000000 *Bad* PSYND=0x0040
Dec  5 14:46:58 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x20): 0x00000000.00000000
Dec  5 14:46:58 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x28): 0x00000000.00000000
Dec  5 14:46:58 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x30): 0x00000000.00000000
Dec  5 14:46:58 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x38): 0x00000000.00000000
Dec  5 14:46:58 zactcsut003 SUNW,UltraSPARC-II: [ID 635501 kern.info]
[AFT2] errID 0x00017050.db2e4aff AFAR was derived from E$Tag
Dec  5 14:46:58 zactcsut003 unix: [ID 836849 kern.notice]
Dec  5 14:46:58 zactcsut003 ^Mpanic[cpu0]/thread=3000204fd20:
Dec  5 14:46:58 zactcsut003 unix: [ID 362866 kern.notice] [AFT1] errID
0x00017050.db2e4aff EDP Error(s)
Dec  5 14:46:58 zactcsut003     See previous message(s) for details
Dec  5 14:46:58 zactcsut003 unix: [ID 100000 kern.notice]
Dec  5 14:46:58 zactcsut003 genunix: [ID 723222 kern.notice]
000002a10044b400 SUNW,UltraSPARC-II:cpu_aflt_log+4e0 (2a10044b4be, 1,
10146b18, 2a10044b648, 2a10044b50b, 10146b40)
Dec  5 14:46:58 zactcsut003 genunix: [ID 179002 kern.notice]   %l0-3:
0000000000000000 000002a10044b710 0000000000000003 0000000000000010
Dec  5 14:46:58 zactcsut003   %l4-7: 0000000000200000 0000000000400000
000000000001147c 0000000000000005
Dec  5 14:46:58 zactcsut003 genunix: [ID 723222 kern.notice]
000002a10044b650 SUNW,UltraSPARC-II:cpu_async_error+868 (1, 2a10044b710,
80400040, 0, 640000080400040, 2a10044b8d0)
Dec  5 14:46:58 zactcsut003 genunix: [ID 179002 kern.notice]   %l0-3:
0000000000000001 0000000000000032 0000000000000000 0000000000000000
Dec  5 14:46:58 zactcsut003   %l4-7: 0000000000004208 0000000000000000
0000000000000000 0000000000000000
Dec  5 14:46:58 zactcsut003 genunix: [ID 723222 kern.notice]
000002a10044b820 unix:prom_rtt+0 (3000200cb88, 40, 3000204fd20,
2a10044bba0, 2, 300034e0aa8)
Dec  5 14:46:58 zactcsut003 genunix: [ID 179002 kern.notice]   %l0-3:
0000000000000003 0000000000001400 0000004400001602 000000001013e854
Dec  5 14:46:58 zactcsut003   %l4-7: 00005f005f5f0000 0000000000000000
0000000000000000 000002a10044b8d0
Dec  5 14:46:58 zactcsut003 genunix: [ID 723222 kern.notice]
000002a10044b970 c2audit:audit_start+f0 (d5, 1248, 3000200cb88, 75,
ff3a1c6c, 2)
Dec  5 14:46:59 zactcsut003 genunix: [ID 179002 kern.notice]   %l0-3:
0000000010245e10 00000000101314cc 000000000000ff00 00000000ecff5e64
Dec  5 14:46:59 zactcsut003   %l4-7: 00006e005f650000 0000000010412d58
000000007efefeff 0000000081010100
Dec  5 14:46:59 zactcsut003 genunix: [ID 723222 kern.notice]
000002a10044ba20 genunix:pre_syscall+18c (3000204fd20, ff250000, 0,
2a10044bba0, ff3a114c, 300034e0aa8)
Dec  5 14:46:59 zactcsut003 genunix: [ID 179002 kern.notice]   %l0-3:
0000000000000075 00000300037bc080 0000000000000002 00000000ffbefa1c
Dec  5 14:46:59 zactcsut003   %l4-7: 00000000ff3a14a4 00000000ff3e204c
00000000ffbef88c 00000000ff3a1c6c
Dec  5 14:46:59 zactcsut003 unix: [ID 100000 kern.notice]
Dec  5 14:46:59 zactcsut003 genunix: [ID 672855 kern.notice] syncing file
systems...
Dec  5 14:46:59 zactcsut003 genunix: [ID 733762 kern.notice]  3
Dec  5 14:47:00 zactcsut003 last message repeated 1 time
Dec  5 14:47:01 zactcsut003 genunix: [ID 733762 kern.notice]  2
Dec  5 14:47:11 zactcsut003 last message repeated 9 times
Dec  5 14:47:13 zactcsut003 genunix: [ID 616637 kern.notice]  cannot sync
-- giving up


Dec  8 22:58:47 zactcsut003 SUNW,UltraSPARC-II: [ID 704020 kern.warning]
WARNING: [AFT1] EDP event on CPU0 Data access at TL=0, errID
0x00010685.8b0d7b7a
Dec  8 22:58:47 zactcsut003     AFSR 0x00000000.80400040<PRIV,EDP> AFAR
0x00000000.3f41bba8
Dec  8 22:58:47 zactcsut003     AFSR.PSYND 0x0040(Score 95) AFSR.ETS 0x00
Fault_PC 0x100319a8
Dec  8 22:58:47 zactcsut003     UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000
UDBL.ESYND 0x00
Dec  8 22:58:47 zactcsut003 SUNW,UltraSPARC-II: [ID 889615 kern.info]
[AFT2] errID 0x00010685.8b0d7b7a PA=0x00000000.3f41bba8
Dec  8 22:58:47 zactcsut003     E$tag 0x00000000.0bc007e8 E$State: Modified
E$parity 0x05 Badlines found=3
Dec  8 22:58:47 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x00): 0x00000000.00000000

Dec 10 15:58:21 zactcsut003 SUNW,UltraSPARC-II: [ID 690134 kern.warning]
WARNING: [AFT1] EDP event on CPU0 Data access at TL=0, errID
0x00005567.891801de
Dec 10 15:58:21 zactcsut003     AFSR 0x00000000.80400040<PRIV,EDP> AFAR
0x00000000.2ce21f88
Dec 10 15:58:21 zactcsut003     AFSR.PSYND 0x0040(Score 95) AFSR.ETS 0x00
Fault_PC 0x100319a8
Dec 10 15:58:21 zactcsut003     UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000
UDBL.ESYND 0x00
Dec 10 15:58:22 zactcsut003 SUNW,UltraSPARC-II: [ID 612425 kern.info]
[AFT2] errID 0x00005567.891801de PA=0x00000000.2ce21f88
Dec 10 15:58:22 zactcsut003     E$tag 0x00000000.0fc0059c E$State: Modified
E$parity 0x07 Badlines found=2
Dec 10 15:58:22 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x00): 0x00000000.00000000
Dec 10 15:58:22 zactcsut003 SUNW,UltraSPARC-II: [ID 989652 kern.info]
[AFT2] E$Data (0x08): 0x00800000.00000000 *Bad* PSYND=0x0040
Dec 10 15:58:22 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x10): 0x3fffffff.f666b520
Dec 10 15:58:22 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x18): 0x00000000.008f2967
Dec 10 15:58:22 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x20): 0x00000000.10140f1c
Dec 10 15:58:22 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x28): 0x00000000.00000000
Dec 10 15:58:22 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x30): 0x000002a1.00045d40
Dec 10 15:58:22 zactcsut003 SUNW,UltraSPARC-II: [ID 359263 kern.info]
[AFT2] E$Data (0x38): 0x00000000.00000000
Dec 10 15:58:22 zactcsut003 SUNW,UltraSPARC-II: [ID 288824 kern.info]
[AFT2] errID 0x00005567.891801de AFAR was derived from E$Tag
Dec 10 15:58:22 zactcsut003 unix: [ID 836849 kern.notice]
Dec 10 15:58:22 zactcsut003 ^Mpanic[cpu0]/thread=2a100045d40:
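
The AFSR value in the extracts above can be picked apart by hand. The Python sketch below does so under stated assumptions: the bit positions for PRIV, EDP, ETS and P_SYND are my recollection of the UltraSPARC-II AFSR layout, not something stated in this thread, and should be checked against the processor manual. They do at least agree with the kernel's own decoding in the log (<PRIV,EDP>, PSYND 0x0040, ETS 0x00).

```python
# Assumed UltraSPARC-II AFSR bit layout (verify against the CPU manual).
AFSR_PRIV = 1 << 31        # error occurred in privileged mode
AFSR_EDP = 1 << 22         # E-cache data parity error
AFSR_ETS_SHIFT = 16        # E-cache tag parity syndrome, bits 19:16
AFSR_ETS_MASK = 0xF
AFSR_PSYND_MASK = 0xFFFF   # data parity syndrome, bits 15:0


def decode_afsr(text):
    """Decode an AFSR printed as '0xHHHHHHHH.LLLLLLLL' in the messages file."""
    # The kernel prints the 64-bit register as two 32-bit halves joined
    # by a dot; drop the dot and parse the result as one hex number.
    value = int(text.replace(".", ""), 16)
    flags = []
    if value & AFSR_PRIV:
        flags.append("PRIV")
    if value & AFSR_EDP:
        flags.append("EDP")
    return {
        "flags": flags,
        "ets": (value >> AFSR_ETS_SHIFT) & AFSR_ETS_MASK,
        "psynd": value & AFSR_PSYND_MASK,
    }


if __name__ == "__main__":
    print(decode_afsr("0x00000000.80400040"))
```

All three panics above report the same AFSR value, so the decoded result is the same each time; only the AFAR (the faulting physical address) changes.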
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Thu Dec 13 01:04:12 2001

This archive was generated by hypermail 2.1.8 : Thu Mar 03 2016 - 06:42:30 EST