This summary is quite late because it took a long time to make sure my solution was okay.

----- original query -------------
> We have an ultra-10 that has an A1000 attached (via a PCI differential
> SCSI card). Over the last few days, the machine has rebooted several times
> with the attached error below. Is this a problem with the system memory, CPU
> or the PCI card? Or something entirely different?

> unix: panic[cpu0]/thread=2a100057d60:
> unix:
> simba1: PBM detected parity error.
> simba1: PBM generated system error.
> simba0: partiy error error caused by upa address=f1001ff8 UPA bytemask=0
> simba0: partiy error secondary error simba0: PBM detected parity error.
> simba0: PBM generated system error.
> pci0: PCI SERRpci0: partiy error error caused by
> upa address=1fff1001ff8 UPA bytemask=1
> pci0: partiy error secondary error pci-0: generated partiy error.
> unix:
> unix: syncing file systems...
-------------------------------------------

Thanks to Mike at Mike's list for suggesting that it was the CPU ecache bug.

I will try to give as much detail as possible about this system, since there doesn't seem to be a lot of info in the archives about this particular problem. In the end, I think it was a defective CPU.

<Details>

Using a known good system for spare parts, I swapped out the original memory. The memory swap went okay, but a day later the system completely died: no video or drive activity. The service contractor had to swap in a new system board to get it working again. Note that he tried a new CPU (lower speed, though) before swapping the mobo, but there was still no signal or activity. I am not sure why the CPU change at this point didn't fix the system.

With the new mobo (original CPU and memory, minus the SCSI PCI card), it ran fine for about a week with a "while (1)" loop running a few jobs. When I swapped it into production (now with the original SCSI card), it lasted for ~3 days before the reboot cycles started again.
I removed it from production again and took out the SCSI PCI card, but kept the system up, doing pretty much nothing. The reboot cycles stopped, which made the SCSI card or the PCI riser card strong suspects. However, about a week to 10 days later, the reboot cycles started again.

Finally, I changed the CPU. The system hasn't rebooted with the above errors in 3-4 weeks. It now has a new CPU and a new system board. I will not put this machine back into production again.

If you encounter these PBM / pci0 parity errors, I would suggest removing the machine from any production work immediately and starting by replacing the CPU.

</Details>

-mike

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Tue Feb 18 23:31:49 2003