SUMMARY: Solaris 2.6 & Disksuite 4.0 Panic: Data fault

From: Manuel Gil (mgil@reach.com)
Date: Tue Sep 09 1997 - 23:24:09 CDT


Greetings Managers,
Many thanks to the following experts for their input:
Sydney Weinstein <syd@myxa.com>
Scott McDermott <scottm@kcls.org>
Glenn Satchell <Glenn.Satchell@uniq.com.au>
Asim Zuberi <asim@psa.pencom.com>

In a nutshell, this was a slight case of failing to RTFM properly. Here
are the solutions to the problem:

- Disksuite 4.0 doesn't work with Solaris 2.6 (I had the patch for 4.0
but this was irrelevant).

- Disksuite 4.1 is indeed included in the Solaris 2.6 release (I just
didn't look hard enough). The package is on the Intranet CD.

- After installing 4.1 and its corresponding Jumbo patch, all worked fine
and dandy.
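
The pairing above can be checked before installing anything. What follows
is only a minimal sketch: the function name ds_supported is hypothetical,
and the table encodes just the two combinations reported in this thread.
It relies on the fact that `uname -r` prints 5.6 on Solaris 2.6.

```shell
#!/bin/sh
# Hypothetical helper, not part of DiskSuite or Solaris.
# $1 = SunOS release as printed by `uname -r` (Solaris 2.6 reports 5.6)
# $2 = DiskSuite version you intend to install, e.g. 4.0 or 4.1
# Returns 0 for the pairing reported as working in this thread,
#         1 for the pairing reported as broken,
#         2 for anything this thread says nothing about.
ds_supported() {
    case "$1:$2" in
        5.6:4.1) return 0 ;;   # worked once the Jumbo patch was applied
        5.6:4.0) return 1 ;;   # panicked on the first write to the mirror
        *)       return 2 ;;   # unknown: check the release notes instead
    esac
}
```

It could be called as: ds_supported `uname -r` 4.0 || echo "check first"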

To answer a question that came up, I did have Sun working with me, and for
the money we're paying these guys they at least should have been able to
tell me that Disksuite 4.0 won't be happy with Solaris 2.6. This list is
far more effective and I am thankful for that.

Next is my original post.

Regards,
Manuel

Greetings Sun Gods,
For the past 2 days I've been dealing with this problem with little
success and, as you can imagine, I am getting a headache from it.
Here's the background info:
Sparcstation 20, two 75 MHz CPUs, 512 MB RAM.
The original OS was Solaris 2.4 with Disksuite 3.0
The plan was to upgrade this host (which by the way is a test host) to
Solaris 2.6 and Disksuite 4.0 (I heard that 2.6 ships with Disksuite 4.1
but the CD for this was not in the package I received, so I just grabbed
the 4.0 CD from my shelves and opted for that).
The upgrade went smoothly up until the very end, when I ran into an
upgrade bug. This one causes the upgrade_script to abort abnormally
because of a syntax error in it (a 'fi' with no corresponding
'if'). A work-around is mentioned in the docs so, after applying the
remedies, the upgrade concluded normally (so it said).

I proceeded then to upgrade Disksuite. I went for a new installation,
instead of running the conversion script. This meant backing up my
metamirror, clearing it, removing the replicas, removing the 3.0 package,
and then adding 4.0 and recreating the replicas and metadevices. All of
this went OK. But now, after I mount the metamirror, I am able to read its
contents, but if I try to write something, or save a new file, the machine
panics and reboots. 'metastat' and 'fsck' show no problems. I have tried
starting from scratch and re-formatting the disks, and recreating the
metadevices, and I still see the same problem. At this point I have run
out of tricks and I turn to you for suggestions, pointers, etc.
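
For readers following along, the fresh-install sequence described above
might look roughly like the dry run below. This is a sketch under stated
assumptions, not the exact commands that were typed: the function name
plan_reinstall, every device and metadevice name, the tape device, and
the package directory are hypothetical, the package name SUNWmd is
assumed, and the syntax is that of DiskSuite 4.x. The function only
prints the commands; it runs none of them.

```shell
#!/bin/sh
# Dry run only: prints the planned commands, executes nothing.
# Arguments (all placeholders chosen by the caller):
#   $1 mirror metadevice   $2 first submirror   $3 second submirror
#   $4 slice for $2        $5 slice for $3      $6 slice holding replicas
#   $7 directory containing the new DiskSuite package
plan_reinstall() {
    mirror=$1
    sub1=$2
    sub2=$3
    slice1=$4
    slice2=$5
    dbslice=$6
    pkgdir=$7
    # /dev/rmt/0 is an assumed tape device for the backup step.
    cat <<EOF
ufsdump 0f /dev/rmt/0 /dev/md/rdsk/$mirror
umount /dev/md/dsk/$mirror
metaclear -r $mirror
metadb -d -f $dbslice
pkgrm SUNWmd
pkgadd -d $pkgdir SUNWmd
metadb -a -f -c 3 $dbslice
metainit $sub1 1 1 $slice1
metainit $sub2 1 1 $slice2
metainit $mirror -m $sub1
metattach $mirror $sub2
EOF
}
```

Example: plan_reinstall d0 d1 d2 c0t1d0s0 c0t2d0s0 c0t1d0s7 /cdrom/ds
Review the printed list against the DiskSuite documentation for the
version in hand before running any of it by hand.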

I enclose the panic messages. I will summarize. Many thanks.

=============================================================================
BAD TRAP: type=9 rp=fbf739dc addr=39 mmu_fsr=126 rw=1
sched: Data fault
kernel read fault at addr=0x39, pme=0x0
MMU sfsr=126: Invalid Address on supv data fetch at level 1
pte addr = 0xf5984000, level = 1
pid=0, pc=0xf004b2e0, sp=0xfbf73a28, psr=0x401005c5, context=0
g1-g7: f004b4f4, 3c, f5d3f488, f5d3f488, 0, 1, fbf73e80
Begin traceback... sp = fbf73a28
Called from f0059dc0, fp=fbf73aa0, args=0 1 0 1 f004b2dc f025e3c0
Called from f63389f4, fp=fbf73b00, args=1 1 ffdfb9f4 8c519 9a 8
Called from f633a0f8, fp=fbf73b60, args=1 0 20 fffffffe 0 f0985a80
Called from f0090c40, fp=fbf73bc0, args=2 f5de6938 f6274570 f5de6b2c 0 0
Called from f5de29e4, fp=fbf73c30, args=f5dc5c98 f0279818 20 f5d565e4 f5d92400 f61b28a8
Called from f0090c40, fp=fbf73c90, args=f5d497b8 f5de290c f5d926e0 f5dc5c98 f61b2890 f5d926b8
Called from f5dcdd3c, fp=fbf73d00, args=f5d497b8 f0279818 20 222e00 f5daf6d0 8
Called from f5cc7d58, fp=fbf73d60, args=10000 f5d497b8 f5daeb34 f5daea90 0 1
Called from f5cc9898, fp=fbf73dc0, args=f597e924 f597e918 f5daf6d0 f0281898 10 0
Called from f004336c, fp=fbf73e20, args=f5da2558 f597e918 f5cd1a08 f5da2558 1 f5da2e78
Called from f00711f4, fp=fbf58d60, args=40400ae1 40400ae1 0 404000e1 0 200
End traceback...
panic: Data fault
syncing file systems...panic: panic sync timeout
 6737 static and sysmap kernel pages
  140 dynamic kernel data pages
  404 kernel-pageable pages
    0 segkmap kernel pages
    0 segvn kernel pages
    0 current user process pages
 7281 total pages (7281 chunks)
 
dumping to vp f623ce74, offset 761019
7281 total pages, dump succeeded
rebooting...
Resetting ...

______________________________________________________
Manuel Gil
Network Operations Manager / System Exorcist
Reach Networks, Inc.
New York, NY
Voice : 212-274-0444
Fax : 212-274-0695
Email : mgil@reach.com
WWW-mine: http://www1.reach.com/~mgil
____________________________________________ObSig1.2__


