(LONG) SUMMARY+WARSTORY: PT wide-diff SCSI SBus Card, IBM 0664N1D Disks

From: Jochen Bern (bern@penthesilea.Uni-Trier.DE)
Date: Mon May 16 1994 - 22:22:51 CDT

Well, it has been several Weeks by now, but Things seemingly are starting
to be *predictable* at least.

Hardware we're talking about: SS2 Clone, internal 1GB Disk, original SUN
CD-ROM Drive, Pizzabox with 4 IBM 0664N1D Disks for wide-diff SCSI,
Performance Technologies SBS440A D wide-diff SCSI SBus Card. As
most of you will hopefully remember, I had Problems both with the
format Params for these Drives and with the Driver for the SBus Card.

More precisely, I was able to format the Drives (with a preliminary Set
of Params given to me by my Vendor), but whenever I tried to newfs them,
I experienced a panic: Data fault in various Stages of the Process.

The first Thing I did was to start mailbombing John DiMarco (thanks John!)
who was unfortunate enough to reply to my initial format param Request.
This helped my Understanding of how such Params get used a Lot. To
summarize the Highlights:

1) Why not ask the Drive itself what Params it would like to be set to?
   Get ftp.cdf.toronto.edu:/pub/scsiinfo/scsiinfo-3.2.shar.gz and it'll
   tell you!
2) If you have a second Param Source, you will most probably find
   several Differences. Don't worry, you don't need to match correct (tm)
   Values except for three Things:
   First, #Heads * #Sects/Track * #Cyls should not exceed the total Number
   of Sectors from the Manufacturer's Specs (for 512 Byte/Sector, if
   several are given; If you do violate this, you'll get an Error as
   soon as you set the Params with format);
   Next, to cite John,
> Number of heads [should be exact], and number of sectors per track should
> be close to the average for that drive.
   The Differences can easily sum up to several MB!
3) 0664's are "banded", which can't be appropriately reflected by a Set
   of format Params anyway. ;-)
4) Don't worry about the Effect of different Params on the Pointers in
   your Defect List, this List gives PHYSICAL Coordinates completely
   independent of the Coordinates you'd obtain when "computing back"
   absolute Sector Numbers with the Params.
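
The size Constraint from 2) can be checked before ever starting format. A minimal Sketch, assuming you know the Drive's total Sector Count; the Function Name `geometry_fits` and the Total used below are hypothetical, NOT taken from the IBM Specs:

```python
def geometry_fits(cyls, nhead, nsect, total_sectors):
    """Return (addressed, fits) for a candidate set of format params.

    addressed -- sectors the geometry claims (cyls * nhead * nsect)
    fits      -- True if that does not exceed the drive's real sector count
    """
    addressed = cyls * nhead * nsect
    return addressed, addressed <= total_sectors


# HYPOTHETICAL total sector count, for illustration only; take the real
# figure (for 512 bytes/sector) from the manufacturer's specs.
TOTAL_SECTORS = 4_000_000

addressed, fits = geometry_fits(2850, 15, 92, TOTAL_SECTORS)
print(addressed, fits)
```

Which Cylinder Count to feed in (data only, or including Alternates) is left to the Caller here, since the Rule above just says "#Cyls".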

Currently, I have the Disks reformatted with 2848 Data Cyls, 2 Alt Cyls,
2850 Phys Cyls, 15 Hds, 92 Sec/Trk, and 5400 rpm. This works as far as
I've gotten 'til now (flooding the newfs'ed Partitions with Copies of
/vmunix, waiting for the Host to crash again). Mixing this, the scsiinfo
Results, and some Guesswork with the Manufacturer's "Overhead Bytes per
Sector" Data, I end up with:

disk_type = "IBM OEM 0664N1D" \
        : ctlr = SCSI : fmt_time = 4 \
        : trks_zone = 15 : atrks = 0 : asect = 34 \
        : ncyl = 2848 : acyl = 2 : pcyl = 2850 : nhead = 15 : nsect = 92 \
        : rpm = 5400 : bpt = 56672

This gives me actually ~1/4 MB more than the scsiinfo Data, and 7.5 MB more
than the Vendor Params.(!) I tried ncyl=2850:pcyl=2852 as well, but format
didn't like these.
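
For Reference, the Arithmetic behind this Entry can be redone in a few Lines. This is only a Sketch using the Values from the disk_type Entry above; reading bpt / nsect as "raw Bytes per Sector including Overhead" is my Interpretation, and the Variable Names are made up for the Example:

```python
SECTOR_BYTES = 512  # user data bytes per sector

# Values copied from the disk_type entry
ncyl, nhead, nsect, bpt = 2848, 15, 92, 56672

data_sectors = ncyl * nhead * nsect          # usable sectors
data_bytes = data_sectors * SECTOR_BYTES     # usable capacity in bytes
cyl_bytes = nhead * nsect * SECTOR_BYTES     # what one cylinder is worth
raw_per_sector = bpt // nsect                # data plus per-sector overhead

print(data_sectors)                   # 3930240
print(data_bytes)                     # 2012282880
print(cyl_bytes)                      # 706560
print(raw_per_sector, bpt % nsect)    # 616 0
```

One Cylinder is worth roughly 0.7 MB, so a Difference of a few Cylinders between two Param Sets quickly adds up to several MB.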

However, at the same Time I had to start fighting the Driver for the
wide-diff SCSI SBus Board. :-C As I said, whenever I tried a newfs, the
Host froze Rock solid. Luckily, Performance Technologies has a Tech
Support Email Address. Several Days later, they answered:

> Open up the following file /usr/sys/scsi/adapters/ptisp_conf.c and change the
> following line:
> /*
> * SBus burst size.
> */
> int ptisp_burst_size[NPTISPBUS] = {
> -1, -1, -1, -1, -1, -1, -1, -1,
> };
> to
> /*
> * SBus burst size.
> */
> int ptisp_burst_size[NPTISPBUS] = {
> 5, 5, 5, 5, 5, -1, -1, -1,
> };
> This will set the SBus communication burst size to 16 instead of the default
> 32 bit burst. Perhaps this will solve the problem by having better
> communication between the SS2 clone and the SCSI card. You will have to "make"
> the kernel again to have these values take any effect.

Reduce the avg Bus Speed? Not exactly welcome, but anything you say, Doc ...

It turned out that this brought newfs back to Life *for the smaller
Partitions* (500 MB and less); The big ones (1000 MB) still froze the
Host. I replied and started waiting.

After this, I found some spare Time and tried to install one of the
new Classics we've got. No Way. Uh, what's wrong? Well, I believed the
SUN Sales People when, to my Question "I need to run SS10 and Classics,
what 4.x OS Upgrade should I order", they told me "put this Part Number
on your Order". *Big* Mistake. That was the first Time I heard that there
is a SunOS Version supporting only one of SS10 and Classics: it's 4.1.3
aka Solaris 1.1, and as anybody knowing Murphy could have told me, it did
support SS10 (on which I had installed the temporary Server) but not
Classics (which I had left behind for later Installation).

Fortunately, we ordered our Classics from somebody who seems to have one
or the other good Idea about OSes: two were delivered with a special 1.1
on QIC and two included 1.1.1B CDROMs. Great! I thought. Just one more
Day to upgrade everything. Let's see whether that does cure the Driver
Problems; I'll do a complete Installation on the Host with the new Disks
and reformat with those new Params while I'm at it.

OK, so I did. Installation, reformat one Disk to try Things out, reboot
just to get everything in standard State. Oooops, little Typo in
Partitioning, won't boot - OK, let's boot single User and fix that. Now
newfs. *BINGO*! Very good. Let's reformat and newfs the other three
Drives over Night.

The next Morning, four Drives were ready for use. Um, still single User?
Alright, Ctrl-D to get up multiuser. Hmmm. Now what's responsible for
getting Things to work, 1.1.1B or single User? Nah ... must be the OS ...
Oh well, let's newfs one Partition again just to make sure.


Huh what? newfs works OK in single but crashes the Host in multi?!?
What will I experience once I start *using* the Disks?????
Let's try mounting one Partition, and flooding it with Copies of /vmunix,
maybe there will be some less cryptic Error Messages ...

Five Days later (i.e., right now), the Host has made 38 Passes over all
Partitions with FSes, flooding them with Kernel Copies 'til they're at
100% and rm'ing them again with every Pass, and there was no Error.
Seems that the Driver only barfs when asked to do a newfs in multiuser
Mode, but not when dealing with Files, formatting, or running single
User. In Addition, there's this SBus Burst Size Stuff *somehow* affecting
this, and the Point where the Driver freezes the OS varies from Crash to
Crash, too. I'm thinking about taking Lessons in black Magic; does anyone
know a good Teacher? :-S
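
For anyone wanting to repeat the Flood Test, here is a rough Sketch of one Pass: copy a File into a Directory until the Filesystem reports "no Space left", then remove the Copies again. This is not the Script I actually ran; `flood_pass`, the File Names and the `max_copies` Limit (handy for a dry Run) are Assumptions made for the Example:

```python
import errno
import os
import shutil


def flood_pass(src, target_dir, max_copies=None):
    """Fill target_dir with copies of src, then remove them again.

    Copies until the filesystem is full (ENOSPC) or until max_copies
    copies exist. Returns the number of complete copies written.
    Any other I/O error is re-raised so it shows up in the output.
    """
    made = []
    try:
        while max_copies is None or len(made) < max_copies:
            dst = os.path.join(target_dir, "flood.%d" % len(made))
            try:
                shutil.copyfile(src, dst)
            except OSError as err:
                # Drop the partially written copy, if any
                if os.path.exists(dst):
                    os.remove(dst)
                if err.errno == errno.ENOSPC:
                    break  # filesystem is full, this pass is done
                raise
            made.append(dst)
    finally:
        # The rm step: clean up all complete copies of this pass
        for path in made:
            os.remove(path)
    return len(made)
```

Calling something like `flood_pass("/vmunix", "/mnt/test")` in a Loop over all mounted Partitions would give one Pass each; the Mount Point is of course just a Placeholder.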

Thanks to:
        John DiMarco <jdd@db.toronto.edu>
        Paul Walters <walters@prl.philips.co.uk>
        Mike Raffety <mike_raffety@il.us.swissbank.com>
        Dan Stromberg <strombrg@hydra.acs.uci.edu>

                                                                        J. Bern

__/\_____________________________________________   ___________________________
 /  \                                            \ /                        /\
/ J. \ EMail: bern@[TI.]Uni-Trier.DE / ham: DD0KZ X  More Infos on me from /  \
\Bern/ X.400: <---- temporarily disabled ---->   / \  the X.500 Directory; \  /
 \  /  P. O. Box 1203, 54202 Trier, Germany     /   \  Pub Keys via finger  \/
__\/___________________________________________/ EOF \_________________________
