SUMMARY: which DIMM is bad

From: Carl Carpenter (carlc@kivex.com)
Date: Thu Aug 20 1998 - 12:16:28 CDT


I usually do not summarize until I have received the answers, done the
maintenance, and confirmed that it works. But I have received 3 replies
giving me the same answer and several requests to please relay it. I
have no doubt that I have gotten the answer I was looking for, so here
is the summary.

Thanks go to
    Martin D Baldenegro
    Christopher L. Barnard
    Lars Balker Rasmussen

----------------------------------------

ANSWER: (I have only quoted one answer since they were all the same)

  The SPARC 5 memory is laid out in the following manner.
  From your output it looks like the bad DIMM is in location
  J0303, as the MFAR addresses all fall within that range.

J0403 7th 0e000000-0fffffff
J0402 6th 0c000000-0dffffff
J0401 5th 0a000000-0bffffff
J0400 4th 08000000-09ffffff
J0303 3rd 06000000-07ffffff

MFSR=81802860 MFAR=645c078
        thru
MFSR=81802860 MFAR=78b3220

J0302 2nd 04000000-05ffffff
J0301 1st 02000000-03ffffff
J0300 0th 00000000-01ffffff

  Hope this helps,
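
  The lookup above can also be done mechanically. What follows is only
  a minimal sketch, not part of the original reply: it assumes every
  slot holds a 32MB DIMM, so that each bank spans 0x02000000 bytes
  exactly as in the table, and the names SLOTS, BANK_SIZE,
  dimm_for_mfar and slots_in_log are made up for illustration. With
  mixed DIMM sizes the mapping may well differ.

```python
import re

# Slot order copied from the table in the answer (lowest address first).
SLOTS = ["J0300", "J0301", "J0302", "J0303",
         "J0400", "J0401", "J0402", "J0403"]

# Assumption: every bank is 32MB, matching the 8x32MB system described.
BANK_SIZE = 0x02000000


def dimm_for_mfar(mfar_hex):
    """Map a hex MFAR string (as printed in the panic message) to a slot."""
    addr = int(mfar_hex, 16)
    index = addr // BANK_SIZE
    if not 0 <= index < len(SLOTS):
        raise ValueError("address %s is outside the 8 banks" % mfar_hex)
    return SLOTS[index]


def slots_in_log(text):
    """Return the set of slots hit by every MFAR= value found in text."""
    return {dimm_for_mfar(m) for m in re.findall(r"MFAR=([0-9a-fA-F]+)", text)}


if __name__ == "__main__":
    sample = (
        "panic: asynchronous memory fault: MFSR=81802860 MFAR=645c078\n"
        "panic: asynchronous memory fault: MFSR=80802860 MFAR=7c5872c\n"
        "panic: asynchronous memory fault: MFSR=81802860 MFAR=78b3220\n"
    )
    print(sorted(slots_in_log(sample)))
```

  If the set that comes back holds a single slot, the faults all point
  at one DIMM; more than one slot would suggest the problem is not
  confined to a single module.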

----------------------------------------

QUESTION:

I believe we have a bad DIMM in our news server and I need to know how
to figure out which one it is. The last time we saw this, on a different
system, it only had 64MB and we found the bad DIMM by elimination
(pulled one DIMM out and the error still occurred, switched DIMMs and
the error never occurred again). This system has 8 DIMMs and is a
production machine, so I cannot afford to find it by elimination.

Our system is a SPARC 5/170 with 256MB RAM (8x32MB DIMMs).

The errors we are seeing in /var/adm/messages:
    Aug 18 00:55:25 news unix: panic: asynchronous memory fault: MFSR=81802860 MFAR=645c078
    Aug 18 19:55:09 news unix: panic: asynchronous memory fault: MFSR=81802860 MFAR=645c078
    Aug 18 19:55:09 news unix: panic: asynchronous memory fault: MFSR=80802860 MFAR=7c5872c
    Aug 19 02:16:29 news unix: panic: asynchronous memory fault: MFSR=81802860 MFAR=7a18e60
    Aug 19 03:32:37 news unix: panic: asynchronous memory fault: MFSR=80802860 MFAR=7a55e08
    Aug 20 03:41:32 news unix: panic: asynchronous memory fault: MFSR=80802860 MFAR=7a55e08
    Aug 20 03:41:33 news unix: panic: asynchronous memory fault: MFSR=81802860 MFAR=78b3220

Can someone tell me which DIMM is giving this error and causing the
server to reboot?

Thanks, and I will summarize.

carlc@kivex.com


