Summary: Failing memory / E6500 / Solaris 8

From: Geoff Lane <zzassgl_at_zoe.mcc.ac.uk>
Date: Fri Nov 14 2003 - 07:49:23 EST

It might have been possible to unconfigure a CPU/memory board if we had
previously enabled Dynamic Reconfiguration (DR) and a number of other
conditions applied (such as not having interleaved memory on the board).

The gory details can be read in
	http://docs.sun.com/db/doc/806-3984-10

Unfortunately we have not enabled DR and we do have interleaved memory.

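Since the board has to stay in place until the engineer arrives, it can at
least be useful to confirm that the records all point at the same part. The
following is a minimal sketch, not something anyone in the thread ran: it
assumes a Python 3 interpreter is available (possibly on another machine the
log has been copied to) and that the records keep the wording shown in the
original question below. The function name summarise_ce and the SAMPLE text
are illustrative only.

```python
import re
from collections import Counter

# Patterns follow the wording of the AFT0 records quoted in this thread;
# other kernel patch levels may word the messages differently (assumption).
BOARD_RE = re.compile(r"Corrected Memory Error on Board\s+(\d+)\s+(J\d+)")
BIT_RE = re.compile(r"ECC Data Bit\s+(\d+)\s+was in error and corrected")


def summarise_ce(text):
    """Count corrected-error records per (board, socket label) and per ECC data bit."""
    sockets = Counter((int(board), label) for board, label in BOARD_RE.findall(text))
    bits = Counter(int(bit) for bit in BIT_RE.findall(text))
    return sockets, bits


# Two records copied from the original question, used here as sample input.
SAMPLE = (
    "Nov 14 09:36:02 XXX SUNW,UltraSPARC-II: [ID 379379 kern.notice] [AFT0]\n"
    "\terrID 0x0023026b.c7cf4fb3 Corrected Memory Error on Board 2 J3300 is Sticky\n"
    "Nov 14 09:36:02 XXX SUNW,UltraSPARC-II: [ID 196652 kern.notice] [AFT0]\n"
    "\terrID 0x0023026b.c7cf4fb3 ECC Data Bit 46 was in error and corrected\n"
)

sockets, bits = summarise_ce(SAMPLE)
for (board, label), count in sockets.most_common():
    print(f"board {board} {label}: {count} record(s)")
for bit, count in bits.most_common():
    print(f"ECC data bit {bit}: {count} correction(s)")
```

To run it against the real log, pass the contents of the messages file
(normally /var/adm/messages on Solaris) to summarise_ce() instead of SAMPLE.
If more than one board or socket label shows up in the tally, that is worth
mentioning to the engineer before the swap.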

Steven Hill says...

	You can't drop the memory board on its own, but you can drop the
	entire processor board. (That is assuming you set the options for
	hot plug on the box...)

	I have to say though, we never managed to successfully hot plug a
	board on the E6500 without it hanging, so I'd be tempted to let it
	sit.

Original question...

On Fri, Nov 14, 2003 at 10:11:24AM +0000, Geoff Lane wrote:
> We are currently waiting for an engineer to replace a faulty memory card
> which is causing the records below to stream continuously into the messages
> file.
> 
> The system is a 24 CPU E6500 running Solaris 8
> SunOS XXX 5.8 Generic_108528-15 sun4u sparc SUNW,Ultra-Enterprise
> 
> Is there any way we could disable the memory card while we are waiting for a
> replacement?
> 
> Thanks,
> 
> Nov 14 09:36:02 XXX SUNW,UltraSPARC-II: [ID 379379 kern.notice] [AFT0]
> 	errID 0x0023026b.c7cf4fb3 Corrected Memory Error on Board 2 J3300 is Sticky
> Nov 14 09:36:02 XXX SUNW,UltraSPARC-II: [ID 196652 kern.notice] [AFT0]
> 	errID 0x0023026b.c7cf4fb3 ECC Data Bit 46 was in error and corrected
> Nov 14 09:36:02 XXX SUNW,UltraSPARC-II: [ID 412025 kern.notice] [AFT0]
> 	Corrected Memory Error on CPU20, errID 0x0023026b.c936cdca
> Nov 14 09:36:02 XXX     AFSR 0x00000000.00100000<CE> AFAR
> 	0x00000001.56c07250
> Nov 14 09:36:02 XXX SUNW,UltraSPARC-II: [ID 412183 kern.notice] [AFT0]
> 	errID 0x0023026b.ce9d796b ECC Data Bit 46 was in error and corrected
> Nov 14 09:36:02 XXX     AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00
> 	Fault_PC 0xff29f250
> Nov 14 09:36:02 XXX SUNW,UltraSPARC-II: [ID 306744 kern.notice] [AFT0]
> 	errID 0x0023026b.cfae2fc9 ECC Data Bit 46 was in error and corrected
> Nov 14 09:36:02 XXX SUNW,UltraSPARC-II: [ID 526881 kern.notice] [AFT0]
> 	errID 0x0023026b.cffdd42e ECC Data Bit 46 was in error and corrected
> Nov 14 09:36:02 XXX     AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00
> 	Fault_PC 0xff31f250
> 
> 
> -- 
> /\ Geoff. Lane. /\ Manchester Computing /\ Manchester /\ M13 9PL /\ England /\
> 
> "Bother", said Pooh, as the vice squad took his GIFS
> _______________________________________________
> sunmanagers mailing list
> sunmanagers@sunmanagers.org
> http://www.sunmanagers.org/mailman/listinfo/sunmanagers

-- 
/\ Geoff. Lane. /\ Manchester Computing /\ Manchester /\ M13 9PL /\ England /\

"Bother", said Pooh, as the vice squad took his GIFS
Received on Fri Nov 14 07:49:18 2003
