Hi,

My problem is still not solved. I got a few replies, which suggest that it is a hardware problem only. I will call Sun when my machine reboots next time (it hasn't rebooted since I posted this message). I am sending this summary to share the suggestions I got. I will post another summary when my problem is completely solved.

Thanks to all those who replied. I got replies from
roman.pestka@CommerzbankIB.com,
pmora@cgob.junta-andalucia.es,
joe.fletcher@metapack.com,
willief@base-2.com

Roman suggested that "This is a reboot initiated by the improved kernel Ecache error handling, it is most likely not a memory issue but a CPU issue." and "this is a hardware issue, nothing to do with a PCI card."

pmora had faced similar problems with a 450 and got his CPU replaced three times before his problem was solved, but he doesn't believe that all of the previously replaced CPUs were bad.

Joe Fletcher told me to install the latest GLM patches if a Symbios Logic differential SCSI card was installed on my machine. Well, I don't have any such card in my system.

willief suggested trying a few things, such as checking that the memory modules are seated perfectly in their sockets. He doubted that the motherboard could be faulty. He also believed that a PCI card could not cause this problem, although he later suggested that I give it a try (replacing or taking out the PCI card). He writes: "The last thing I believe you mentioned was a PCI card. You never know. Although the PCI bus is isolated from memory via the PCI interface chip on the motherboard. Its worth a try."

Thanks
Surender

My original posting

> Hello Managers,
>
> I have an Ultra 60 machine that has been rebooting with some memory errors.
> I called up Sun; they tried to troubleshoot the issue and eventually
> they changed everything, including the system board, CPU, memory and even
> the power supply! The OS was also reloaded. But the problem was not
> solved even after that. Have any of you ever faced such a problem?
> As all my hardware has been changed, I don't want to believe this could be a
> hardware issue. Is there any patch available for this problem? I also have
> a suspicion about a PCI card that is installed in this
> system: could the PCI card be the culprit? I can't think beyond patches
> and that little PCI card. Please help me. FYI, this error occurs very
> randomly with just about any process, and the frequency of the error is
> also not fixed. This time it happened after a month but repeated within
> an hour.
>
> Please help me
> Will summarize
>
> Thanks
> Surender
>
> Errors shown in /var/adm/messages
> _______________________________________________________________________
>
> Jan 4 11:19:40 jughead unix: WARNING: [AFT1] Uncorrectable Memory Error on CPU0 Data access at TL=0, errID 0x00007d45.c379075d
> Jan 4 11:19:40 jughead AFSR 0x00000000.00300000<UE,CE> AFAR 0x00000000.091dec88
> Jan 4 11:19:40 jughead AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC 0x1625fc
> Jan 4 11:19:40 jughead UDBH 0x0164<CE> UDBH.ESYND 0x64 UDBL 0x03ed<UE,CE> UDBL.ESYND 0xed
> Jan 4 11:19:40 jughead UDBL Syndrome 0xed Memory Module U0701 U0702 U0703 U0704
> Jan 4 11:19:40 jughead unix: [AFT2] errID 0x00007d45.c379075d PA=0x00000000.091dec88
> Jan 4 11:19:40 jughead E$tag 0x00000000.18c00123 E$State: Exclusive E$parity 0x0c
> Jan 4 11:19:40 jughead unix: [AFT2] E$Data (0x00): 0xd8c2c4c2.c8c2c4c2
> Jan 4 11:19:40 jughead unix: [AFT2] E$Data (0x08): 0xccc2c4c2.c8c2c402 *Bad* PSYND=0x00ff
> Jan 4 11:19:40 jughead unix: [AFT2] E$Data (0x10): 0xd0c2c4c2.c8c2c4c2
> Jan 4 11:19:40 jughead unix: [AFT2] E$Data (0x18): 0xccc2c4c2.c8c2c4c2
> Jan 4 11:19:40 jughead unix: [AFT2] E$Data (0x20): 0xd4c2c4c2.c8c2c4c2
> Jan 4 11:19:40 jughead unix: [AFT2] E$Data (0x28): 0xccc2c4c2.c8c2c4c2
> Jan 4 11:19:40 jughead unix: [AFT2] E$Data (0x30): 0xd0c2c4c2.c8c2c4c2
> Jan 4 11:19:40 jughead unix: [AFT2] E$Data (0x38): 0xccc2c4c2.c8c2c4c2
> Jan 4 11:19:40 jughead unix: [AFT2] E$Data (0x38): 0xccc2c4c2.c8c2c4c2
> Jan 4 11:19:40 jughead unix: NOTICE: Scheduling clearing of error on page 0x00000000.091de000
> Jan 4 11:19:40 jughead unix: [AFT3] errID 0x00007d45.c379075d Above Error is in User Mode
> Jan 4 11:19:40 jughead and is fatal: will reboot
> Jan 4 11:19:40 jughead unix: WARNING: [AFT1] initiating reboot due to above error in pid 12253 (verilog.exe)
> Jan 4 11:19:45 jughead unix: NOTICE: Previously reported error on page 0x00000000.091de000 cleared
> Jan 4 11:19:56 jughead syslogd: going down on signal 15
> Jan 4 11:20:15 jughead unix: automountd not running, retrying
> Jan 4 11:20:24 jughead unix: syncing file systems...
> Jan 4 11:20:24 jughead unix: done
> _______________________________________________
> sunmanagers mailing list
> sunmanagers@sunmanagers.org
> http://www.sunmanagers.org/mailman/listinfo/sunmanagers

Received on Wed Jan 16 06:07:21 2002
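
Since the error is intermittent, it may help to keep a tally of which memory modules and processes each occurrence names, so that the pattern can be shown to Sun on the next call. The following is only a rough sketch: the function name `summarize_aft` and the regular expressions are illustrative assumptions derived from the single log excerpt quoted above, and they expect unwrapped lines as they would appear in /var/adm/messages rather than the mail-wrapped form. Other Solaris releases may format these messages differently.

```python
import re
from collections import Counter

# Patterns are assumptions based only on the excerpt quoted in this post.
ERRID_RE = re.compile(r"\[AFT1\].*?errID (0x[0-9a-f]+\.[0-9a-f]+)")
MODULE_RE = re.compile(r"Memory Module((?: U\d+)+)")
PID_RE = re.compile(r"in pid (\d+) \(([^)]+)\)")


def summarize_aft(lines):
    """Collect errIDs, implicated memory modules and affected processes."""
    err_ids = []
    modules = Counter()
    processes = []
    for line in lines:
        m = ERRID_RE.search(line)
        if m:
            err_ids.append(m.group(1))
        m = MODULE_RE.search(line)
        if m:
            # Group 1 looks like " U0701 U0702 ..."; count each module once per event.
            modules.update(m.group(1).split())
        m = PID_RE.search(line)
        if m:
            processes.append((int(m.group(1)), m.group(2)))
    return err_ids, modules, processes


# A few lines from the excerpt above, with the mail wrapping undone.
SAMPLE = [
    "Jan 4 11:19:40 jughead unix: WARNING: [AFT1] Uncorrectable Memory Error "
    "on CPU0 Data access at TL=0, errID 0x00007d45.c379075d",
    "Jan 4 11:19:40 jughead UDBL Syndrome 0xed Memory Module "
    "U0701 U0702 U0703 U0704",
    "Jan 4 11:19:40 jughead unix: [AFT2] errID 0x00007d45.c379075d "
    "PA=0x00000000.091dec88",
    "Jan 4 11:19:40 jughead unix: WARNING: [AFT1] initiating reboot due to "
    "above error in pid 12253 (verilog.exe)",
]

if __name__ == "__main__":
    ids, mods, procs = summarize_aft(SAMPLE)
    print("errIDs:", ids)
    print("modules:", dict(mods))
    print("processes:", procs)
```

To run it against the real log, pass an open file instead of `SAMPLE`, for example `summarize_aft(open("/var/adm/messages"))`. If the same module bank shows up on every event even after the memory was swapped, that would be consistent with the replies pointing at the CPU or its Ecache rather than the DIMMs, though a tally like this cannot prove it.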