SUMMARY: Problems with Sabre 1230 on rimfire, Sun-3

From: Russ Poffenberger (poffen@sj.ate.slb.com)
Date: Sat Nov 16 1991 - 12:07:10 CST


Thanks to those who responded (only two, but the information was helpful). The
problem appears to be resolved. I ended up double-checking all the sector
switches on the drive, setting them to the Ciprico-recommended settings, and
installing the latest driver. I think the defect list was being
misinterpreted, so that good blocks were mapped out instead of the bad ones;
the bad blocks stayed in use and caused problems later on.
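
To illustrate how a wrong sector setting could turn a correct defect list
into wrong mappings, here is a minimal Python sketch. It assumes defect
entries are given as a byte offset from the index mark on a track (a common
convention for SMD drives, but not confirmed for this setup). The function
name and the sector sizes are invented for illustration; they are not the
Ciprico-recommended values.

```python
# Hypothetical sketch: converting a defect-list entry to a sector number.
# Assumes each entry carries a bytes-from-index offset; the raw sector
# sizes below are invented for illustration only.

def defect_to_sector(bytes_from_index, raw_sector_bytes):
    """Return the index of the sector on the track that contains the defect."""
    return bytes_from_index // raw_sector_bytes

# One defect, 20000 bytes past the index mark on some track.
defect_bfi = 20000

# Raw sector size (data plus header/gap overhead) that the driver assumes...
assumed_raw_sector = 600
# ...versus what the drive's sector switches actually produce.
actual_raw_sector = 640

mapped = defect_to_sector(defect_bfi, assumed_raw_sector)
really_bad = defect_to_sector(defect_bfi, actual_raw_sector)

print("sector mapped out:  ", mapped)
print("sector actually bad:", really_bad)
```

With these made-up numbers the two results differ, so a good sector would be
mapped out while the bad one stayed in service.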

Thanks to those who responded.

runet%gest48@gest20.SINet.SLB.COM (Rune Trengereid)
curt@ecn.purdue.edu (Curt Freeland)

Here are the responses

-------------

From: curt@ecn.purdue.edu (Curt Freeland)
To: poffen@sj.ate.slb.com
Subject: Re: Problems with Sabre 1230 on rimfire, Sun-3

The manufacturer's list should find everything (down to single-bit errors),
as most manufacturers use analog test setups to test the HDA. The electronics
in the drive are not nearly as sensitive, and therefore may "never see" some
of the bad spots in the list, particularly if you have a controller with
decent ECC logic. I have seen controllers that ignore errors up to
11 bits before they will declare a sector bad.
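
As a rough illustration of why small defects can stay invisible, the Python
sketch below measures the error burst between the data written and the data
read back, and compares it with a correction limit. The 11-bit limit is the
figure quoted above; treating it as a single-burst span is an assumption, and
the function names are invented. Real controllers differ.

```python
# Hypothetical sketch: how long is the error burst between what was
# written and what was read back? CORRECTABLE_BITS = 11 comes from the
# figure quoted above; interpreting it as a burst span is an assumption.

CORRECTABLE_BITS = 11

def error_burst_bits(written, read_back):
    """Span in bits from the first to the last differing bit (0 if equal)."""
    if len(written) != len(read_back):
        raise ValueError("buffers must be the same length")
    positions = []
    for i, (a, b) in enumerate(zip(written, read_back)):
        diff = a ^ b
        for bit in range(8):
            if diff & (0x80 >> bit):
                positions.append(i * 8 + bit)
    if not positions:
        return 0
    return positions[-1] - positions[0] + 1

def hidden_by_ecc(written, read_back):
    """True if an error exists but is short enough to be corrected silently."""
    burst = error_burst_bits(written, read_back)
    return 0 < burst <= CORRECTABLE_BITS
```

A sector with a short burst like this would read back clean through the
controller, even though an analog test at the factory flagged the spot.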

Did you actually get a new HDA, or some third-party rebuild? The last time
I spoke to Seagate, they said nobody but Seagate was authorized to rebuild
those HDAs. Many of the repair houses I have spoken to claim to be
authorized by Seagate. Someone is not being too honest out there.

I have had two of the 9720 series drives fail with problems I could have
sworn were HDA related. When I sent them to Seagate for HDA replacement,
I got a surprise. All they replaced were some electronics (interface board
on one, and logic PCB on the other). Both of my drives had multiple
bad spots appear on one head, then things got ***very*** random. It looked
like a soft crash, then pieces flew around in the HDA to crash the rest of it.

I was very shocked to find that electronics could do such a good job of
emulating an HDA failure. If you have another drive, you might try swapping
one board at a time between them. The board that handles servo stuff may
be touchy. If the drives are not similarly aligned, the boards may not
be interchangeable. (I have not found this to be a problem with Seagate
9720 drives, but it could happen one day.)

Then again you may have received another bad HDA.

If you are interested, I can send you a PD source for a program I have
called disktest. It was written back in the PDP-11 v6 Unix days by George
Goble of our Unix group. I have found it useful on Suns/4.XBSD and many
other Unix variants. It can be set to write/read patterns (random, or
supplied by user). It will not "fix" the bad spots, but you can use it
to find the bad spots. It comes with no guarantee...but it has been very
useful to me.
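
For readers without access to such a tool, the idea of a write/read pattern
test can be sketched in a few lines of Python. This is NOT the disktest
program mentioned above; the function names, block size, and default pattern
are assumptions made for illustration. It overwrites its target, so point it
only at a scratch file or at a disk whose contents you can afford to lose.

```python
# Hypothetical sketch of a write/read pattern test; NOT the disktest
# program mentioned above. It overwrites the target.
import os

BLOCK_SIZE = 512  # assumed block size

def make_block(pattern, block_no):
    """Build one block: a 4-byte block number followed by the repeating pattern."""
    stamp = block_no.to_bytes(4, "big")
    body = (pattern * (BLOCK_SIZE // len(pattern) + 1))[:BLOCK_SIZE - 4]
    return stamp + body

def write_pattern(path, n_blocks, pattern=b"\xa5\x5a"):
    """Fill the first n_blocks of an existing file or device with pattern data."""
    with open(path, "r+b") as f:
        for n in range(n_blocks):
            f.write(make_block(pattern, n))
        f.flush()
        os.fsync(f.fileno())

def verify_pattern(path, n_blocks, pattern=b"\xa5\x5a"):
    """Read the blocks back and return the numbers of those that do not match."""
    bad = []
    with open(path, "rb") as f:
        for n in range(n_blocks):
            if f.read(BLOCK_SIZE) != make_block(pattern, n):
                bad.append(n)
    return bad
```

Stamping each block with its own number also catches blocks that land at the
wrong address. On a real disk the read-back may be served from the buffer
cache rather than the platters, so a serious test would go through the raw
device; this sketch does not handle read errors raised by the device.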

Good luck with it

----------------

From: runet%gest48@gest20.SINet.SLB.COM (Rune Trengereid)
To: poffen@sj.ate.slb.com
X-Vms-From: GEST48::runet "Rune Trengereid"
X-Vms-To: sj.ate.slb.com::poffen
Subject: Problems with Sabre 1230 on rimfire, Sun-3

Cc: gest20.sinet.slb.com::runet

Hello,

I have experienced the problem you are describing many times. My suggestion
is that you try replacing the Rimfire controller. If you don't have a spare,
you can try swapping the ports the drives are connected to, i.e. disk #1,
which is now connected to port A, should be connected to port B (remember to
change the unit no.), and disk #2, which is now connected to port B, should
be connected to port A.

Even if the controller is working all right on one port, it is not necessarily
working on the other (which I have experienced). Together with all the other
things you have done so far, this should let you decide whether you still
have a bad HDA.

--------------

The original question:

-------------

I am having a problem with a Sun 3/260, 32M memory, and a Ciprico Rimfire
3200 controller. The OS is 4.0.3, running version 2.0 of the Ciprico drivers.

The controller supports two CDC (before Seagate) Sabre-V 9720-1230 SMD disks.
One of the disks is fine; the other has had many baffling problems with bad
sectors.

What will happen is that bad sectors (many hard errors, non-correctable)
keep popping up. I have formatted the drive several times, but no luck. It
generally seems that it is the same blocks each time, but sometimes they are
different, making it difficult to determine if the HDA, or something else
is at fault.
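
One way to make the "same blocks or different blocks" question concrete is to
record the bad blocks from each pass and compare the sets. The Python sketch
below does that; the function name and the block numbers are invented for
illustration and are not from the actual drive.

```python
# Hypothetical sketch: separate bad blocks that recur on every pass from
# those that show up only sometimes. The block numbers are invented.

def classify_bad_blocks(runs):
    """runs: list of sets of bad block numbers, one set per test pass."""
    if not runs:
        return set(), set()
    persistent = set.intersection(*runs)   # bad on every pass
    seen = set.union(*runs)                # bad on at least one pass
    return persistent, seen - persistent

runs = [
    {1042, 1043, 88110, 90211},
    {1042, 1043, 88110, 17},
    {1042, 1043, 88110, 55302},
]
persistent, intermittent = classify_bad_blocks(runs)
print("persistent:  ", sorted(persistent))
print("intermittent:", sorted(intermittent))
```

Blocks that fail on every pass are more likely to be media defects; blocks
that wander from pass to pass point more toward electronics, cabling, or the
controller, though neither conclusion is certain.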

Here is what I have done so far.

1.) Had the HDA replaced.

2.) Isolated the machine, with only the problem drive connected.

3.) Used the cables from the good drive.

4.) Used the power supply and interface from the good drive.

Nothing I have done has made any difference. Is it safe to assume that the
original manufacturer's defect list should point out ALL defects on a new HDA?
I can't believe that there would be so many more defects on a new HDA than
are listed in the defect list. The verify option in rfutil is quite lacking
when it comes to actually exercising the disk. If I could run Sun's format,
at least I could exercise different read/write patterns. Currently I dd or
mkfile the disk full and try to read it back; this picks up many more bad
blocks than verify did in rfutil, but it is difficult to work with.
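
A read-back pass of this kind is easier to work with if it logs failing
blocks and keeps going instead of stopping at the first error. The Python
sketch below shows the idea; the function name and transfer size are
assumptions, and it relies on seeking to the end to find the size, which may
not work on every raw device.

```python
# Hypothetical sketch: read a file or device block by block and record
# the blocks whose read fails, then carry on with the next block.
import os

BLOCK_SIZE = 8192  # assumed transfer size

def scan_for_read_errors(path, block_size=BLOCK_SIZE):
    """Return (blocks_scanned, list of block numbers whose read raised an error)."""
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        n_blocks = (size + block_size - 1) // block_size
        for n in range(n_blocks):
            try:
                os.lseek(fd, n * block_size, os.SEEK_SET)
                os.read(fd, block_size)
            except OSError:
                bad.append(n)  # note the block and keep going
    finally:
        os.close(fd)
    return n_blocks, bad
```

The returned block numbers are relative to the start of the file or
partition being scanned and in units of the chosen transfer size, so they
need converting before they can be compared with a defect list.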

Could I have gotten another bad HDA? Seems strange, but that is all I can
think of.

-------------

Russ Poffenberger DOMAIN: poffen@sj.ate.slb.com
Schlumberger Technologies UUCP: {uunet,decwrl,amdahl}!sjsca4!poffen
1601 Technology Drive CIS: 72401,276
San Jose, Ca. 95110 Voice: (408)437-5254 FAX: (408)437-5246


