SUMMARY: panic assertion failed: *bnp != UFS_HOLE

From: Andrew Patrick (andrew@calvin.dgbt.doc.ca)
Date: Thu Oct 15 1992 - 02:02:50 CDT


The original question was:

I am having problems with my SPARCstation 1+ (64M RAM, SunOS 4.1.1).
The computer repeatedly crashes (panics) at random times (about once
per day), often giving the error message:

        assertion failed: *bnp != UFS_HOLE, FILE: ../../ufs/ufs_bmap.c,
        line:326
        panic assertion failed

After these crashes, the machine is very difficult to re-boot. Often
the computer fails the initial tests, giving messages like:

        reg nnn bad serr

where "nnn" is some number. At other times, the reboot sequence
results in error messages like "vmunix: truncated file". Turning off
the power for about 2 minutes seems to allow the system to boot, but it
could be just superstition on my part. The machine will stay up longer
when only 48M of RAM is installed, but it still crashes.

I tried running sundiag tests on the physical memory. After 1.5 hours,
sundiag stopped with the following message in the error file:

 10/05/92 18:09:28 calvin.dgbt.doc.ca sundiag INFO: *Start all tests*
 10/05/92 19:38:32 mem pmem FATAL: segmentation violation(11),
     code= 3, Address= 0x1010f010
 10/05/92 19:38:33 calvin.dgbt.doc.ca sundiag INFO: *Failed test*
    (mem) pmem passes: 79 errors: 1
 10/05/92 19:38:34 calvin.dgbt.doc.ca sundiag INFO: *Stop all tests*

There were no other messages in the log files.

I also ran a home-made memory test program we use here. This program
simply mallocs large chunks of memory, and then reads and writes to it
with different patterns (as opposed to sundiag which apparently only
reads). This program is quite good at forcing parity errors in bad
RAM, but it did not produce any here.
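
For illustration only, a pattern test of the kind described above might
look roughly like the C sketch below. It is not the actual program used
here; the names check_pattern and test_chunk and the four bit patterns
are assumptions. A user-level test like this only exercises whatever
physical pages the kernel happens to map for the process, and CPU caches
may absorb some of the accesses, so a clean run does not prove the RAM
is good.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: fill buf with one pattern, read it back,
 * and return the number of words that did not match. */
static size_t check_pattern(volatile uint32_t *buf, size_t nwords,
                            uint32_t pattern)
{
    size_t i, bad = 0;

    assert(buf != NULL);
    for (i = 0; i < nwords; i++)      /* write pass */
        buf[i] = pattern;
    for (i = 0; i < nwords; i++)      /* read-back pass */
        if (buf[i] != pattern)
            bad++;
    return bad;
}

/* malloc one chunk of `bytes` bytes and run several patterns over it.
 * Returns the total number of mismatched words, or (size_t)-1 if the
 * allocation fails. */
size_t test_chunk(size_t bytes)
{
    /* Assumed patterns: all zeros, all ones, alternating bits. */
    static const uint32_t patterns[] = {
        0x00000000u, 0xFFFFFFFFu, 0xAAAAAAAAu, 0x55555555u
    };
    size_t nwords = bytes / sizeof(uint32_t);
    size_t p, bad = 0;
    uint32_t *buf = malloc(nwords * sizeof(uint32_t));

    if (buf == NULL)
        return (size_t)-1;
    for (p = 0; p < sizeof patterns / sizeof patterns[0]; p++)
        bad += check_pattern(buf, nwords, patterns[p]);
    free(buf);
    return bad;
}
```

One way to use it is to call test_chunk(16 * 1024 * 1024) repeatedly
and report any nonzero return value.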

I have also tried two other sets of RAM, and get the same behaviour.
Further, I have tried increasing and decreasing the swap space, and
using swap partitions or swap files, but the problems are the same.

Does anyone have any experience in something like this? Could we be
looking at a bad Memory Management Unit (MMU) here?

--------

The responses I got ranged all over the place. Here is a list, with my
own thinking:
- memory not up to spec
        I have tried 3 sets of memory and always get the same results.
        The current set is supposed to meet Sun's specs for memory for
        SPARCstation 2's.
- problems with dynamic swap space
        I am now only using swap partitions, so comments about making
        dynamic swap files are not relevant.
- bug in the OS
        This has been mentioned by others, but no one has been able to
        provide a specific description or reference.
- problems with hard disk
        Could be, but the disk is showing no other symptoms.
- heat sensitivity
        Could be, but running the Pizza Box open does not help.
- motherboard problems
- bad CPU or MMU
        I think this is the most probable, and am looking into
        replacing the motherboard.

Here are the actual responses:

>From eckhard@ts.go.dlr.de Fri Oct 9 03:34:39 1992
Date: Fri, 9 Oct 92 08:34:05 +0100
From: Eckhard.Rueggeberg@ts.go.dlr.de
Subject: Re: panic assertion failed: *bnp != UFS_HOLE

Sounds like a motherboard exchange (no problem if you are in maintenance).

Eckhard Rüggeberg
eckhard@ts.go.dlr.de

--------
>From miker@sbcoc.com Fri Oct 9 11:42:37 1992
Date: Fri, 9 Oct 92 09:49:41 CDT
From: Mike Raffety <miker@sbcoc.com>
Subject: Re: panic assertion failed: *bnp != UFS_HOLE

Where'd you get the memory from? Maybe it's not quite up to spec?
Sun qualifies its memory with a couple of specific characteristics.
Can you try it with Sun-only memory?

Please be sure to summarize back to the list; thanks.

--------
>From kla!brandari%sunra@Sun.COM Fri Oct 9 12:45:55 1992
Date: Fri, 9 Oct 92 08:47:40 PDT
From: kla!brandari%sunra@Sun.COM (Paul Brandariz x6546)
Subject: Re: panic assertion failed: *bnp != UFS_HOLE

Andrew

        Are you using dynamic swap ? If you created your swapfile with

                mkfile -n swapfile

        this behavior will occur.

Make a new swapfile with space allocated on it (i.e., no holes).
___________________________________________________________________________
Paul R. Brandariz E-mail Internet: paul.brandariz@kla.com
KLA Instruments Corp
P.O. Box 49055 Voice: (408) 456-6546
San Jose, CA 95161-9055 Fax: (408) 434-4273
___________________________________________________________________________
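
To illustrate what a file with holes is, the C sketch below creates one
by seeking to the last byte and writing only that byte, then compares
the allocated blocks with the logical size. This is an assumed way to
produce an effect similar to "mkfile -n", not a description of what
mkfile does internally. The function names are hypothetical, and the
st_blocks comparison is only a heuristic: it assumes 512-byte block
units and can mislead on filesystems that compress data or delay
allocation.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create `path` with logical size `size`, writing only its last byte,
 * so a filesystem that supports holes allocates almost nothing.
 * Returns 0 on success, -1 on error. */
int make_sparse_file(const char *path, off_t size)
{
    int fd, ok;

    assert(size > 0);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    ok = lseek(fd, size - 1, SEEK_SET) == size - 1 &&
         write(fd, "", 1) == 1;          /* one NUL byte at the end */
    if (close(fd) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Logical size of `path` in bytes, or -1 on error. */
long long file_size(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}

/* Heuristic: 1 if fewer bytes are allocated than the logical size,
 * 0 if not, -1 on error. Assumes st_blocks counts 512-byte units. */
int looks_sparse(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return ((long long)st.st_blocks * 512 < (long long)st.st_size) ? 1 : 0;
}
```

A file for which looks_sparse() returns 1 is the kind of swapfile the
advice above says to avoid.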

--------
>From kamaz.kazan.su!Anatoly.Lisovsky@relcom.kiae.su Fri Oct 9 15:16:30 1992
Date: Fri, 9 Oct 92 21:04:51+0400
From: Anatoly.Lisovsky@kamaz.kazan.su ()
Subject: Re: panic assertion failed: *bnp != UFS_HOLE

check hard disk

- Anatoly M. Lisovsky, Kamaz Inc. KamaFlash SMG Chairperson
------------ Network is computer. Per aspera ad Sun! ------------

--------
>From Perry_Hutchison.Portland@xerox.com Fri Oct 9 14:40:18 1992
Date: Fri, 9 Oct 1992 11:39:47 PDT
From: Perry_Hutchison.Portland@xerox.com
Subject: Re: panic assertion failed: *bnp != UFS_HOLE

> reg nnn bad serr
>
> where "nnn" is some number.

This is the self-test reporting a bad register. Almost certainly a
hardware problem.

> Turning off the power for about 2 minutes seems to allow the system to boot

Could be heat-sensitive.

> I have also tried two other sets of RAM, and get the same behaviour.

Not surprising -- it's a CPU or MMU problem, not a RAM problem.

> Could we be looking at a bad Memory Management Unit (MMU) here?

Quite possibly -- in the case of an SS1+ I think the usual recourse
would be to swap the entire CPU board, so you really don't need to
worry too much about just what has gone bad.

--------
>From bukys@cs.rochester.edu Mon Oct 12 10:35:40 1992
Subject: Re: panic assertion failed: *bnp != UFS_HOLE
Date: Mon, 12 Oct 92 10:35:17 -0400
From: bukys@cs.rochester.edu

I believe this is an OS bug. We saw it here when people were doing
mmap()ed access to a VERY sparse file. I don't know if there is a
patch for it.
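
As a rough illustration of the access pattern described above, the C
sketch below extends an empty file with ftruncate(), maps it, and reads
one byte from a region that was never written. The function name and
the sizes are assumptions, and this is not the code that triggered the
panic. On a correctly working system the read should simply return
zero.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a file of logical size `size` with no data written, map it,
 * and read the byte at `offset`, which lies in a hole.
 * Returns the byte value (0..255), or -1 on error.
 * The file is removed again before returning. */
int read_hole_via_mmap(const char *path, off_t size, off_t offset)
{
    int fd, value = -1;
    unsigned char *map;

    assert(size > 0 && offset >= 0 && offset < size);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, size) == 0) {      /* extend without writing data */
        map = (unsigned char *)mmap(NULL, (size_t)size, PROT_READ,
                                    MAP_SHARED, fd, 0);
        if (map != MAP_FAILED) {
            value = map[offset];         /* page fault on a hole */
            munmap(map, (size_t)size);
        }
    }
    close(fd);
    unlink(path);
    return value;
}
```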

-- 
Andrew Patrick, Ph.D.       Communications Research Centre, Ottawa, CANADA
                       andrew@calvin.dgbt.doc.CA
                       
  For a good time, run "telnet debra.dgbt.doc.ca" and login as "chat".


