SUMMARY: disk errors

From: Nils-Henner Krueger (nhk@WHU-Koblenz.de)
Date: Tue Jan 07 1997 - 06:26:17 CST


I reported a problem with disk errors on a newly
connected drive (see original question below).

About ten people came up with useful suggestions.
Thank you to everybody!

General guidelines in such cases are:

- Check overall (external AND internal) cable length.
  The "official" SCSI spec limit is 6m, but that may already
  be too long in practice.

- Check for correct termination at both ends of the bus;
  active termination is preferred.

- Get shorter, high-quality external cables.

- If you want to run several hard disks (that means
  more than two or three), perhaps together with
  a DAT drive, get a second SBus SCSI controller,
  attach the hard disks to it, and leave the DAT drive
  on the internal controller.
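
The cable-length check from the first guideline can be sketched as a
small calculation. This is only an illustration: the segment names and
lengths below are made-up placeholders, not measurements from my setup,
and the 6m figure is simply the limit quoted above. Measure your own
cables, including the runs inside the machine and inside each box.

```python
# Sketch: add up every SCSI cable segment (internal AND external)
# and compare the total with the bus length limit.

SCSI_LIMIT_M = 6.0  # the "official" figure quoted above


def check_bus(segments, limit=SCSI_LIMIT_M):
    """Return (total_length_m, within_limit) for a list of
    (description, length_in_metres) pairs."""
    total = sum(length for _, length in segments)
    return total, total <= limit


# Hypothetical example values; replace them with measured lengths.
segments = [
    ("cabling inside the workstation", 0.9),
    ("workstation to external box 1", 1.0),
    ("cabling inside box 1", 0.5),
    ("box 1 to box 2", 1.0),
    ("cabling inside box 2", 0.5),
]

total, ok = check_bus(segments)
print(f"total bus length: {total:.1f} m, within limit: {ok}")
```

Since 6m may already be too long in practice, a total that passes
this check is not proof that the bus is fine, only that it is not
obviously out of spec.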

I moved the fourth drive from the end of the SCSI chain
to another place on the bus and have seen only a single
read error since, so that was at least an improvement.
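
To judge whether a change like this really helps, the read errors can
be counted per device before and after. The following is a rough
sketch that tallies console messages in the format quoted in the
original question below. The regular expressions are my assumption
based on that one sample; other drivers or OS releases may log
differently, and the function name is just for illustration.

```python
import re
from collections import Counter

# Patterns modelled on the sample messages in the original question.
ERROR_RE = re.compile(r"(sd\d+[a-h]):\s+Error for command '([^']+)'")
BLOCK_RE = re.compile(r"(sd\d+[a-h]):\s+Block \d+, Absolute Block: (\d+)")


def tally_errors(lines):
    """Count error reports per device and collect the absolute
    block numbers that were reported for each device."""
    errors = Counter()
    blocks = {}
    for line in lines:
        m = ERROR_RE.search(line)
        if m:
            errors[m.group(1)] += 1
            continue
        m = BLOCK_RE.search(line)
        if m:
            blocks.setdefault(m.group(1), set()).add(int(m.group(2)))
    return errors, blocks


sample = [
    "sd3a: Error for command 'read(10)'",
    "sd3a: Error Level: Retryable",
    "sd3a: Block 3932032, Absolute Block: 3932033",
    "sd3a: Sense Key: Media Error",
    "sd3a: Vendor ' ' error code: 0x11",
]

errors, blocks = tally_errors(sample)
for dev, count in errors.items():
    print(dev, count, sorted(blocks.get(dev, ())))
```

Run over the saved messages from each night, this gives a simple
count to compare between cabling configurations.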

Original question:

> I'm running a sparc 10-81 with 4.1.3_U1 and I'm having some
> trouble with disk errors. Here's the story.
>
> There are four scsi devices connected to the machine. One
> internal disk, two disks and one dat drive in two external
> boxes. The scsi bus has an active external termination. This
> configuration was running for about a year without any
> problems.
>
> Now I'm trying to add another disk. I put it in an external
> box together with one of the other disk drives. The disk
> was formatted OK and I copied some data onto it, but within the
> next few days every time I rebooted the machine (which I
> had to do quite often because of other problems) I got some
> read errors during file system check on the new disk. I
> fixed the blocks with format - repair, but it kept reporting
> new errors every time the machine came up. I declared the
> disk (NEC DSE 2010S 2GB) broken and replaced it with another
> one, now it's an IBM DFRS S2F (2.1GB). I ran format - analyze
> - test to precheck the disk, no problems.
>
> But to my horror the same problems appeared again. Every night
> during the backup cycle there are about a dozen new unreadable
> blocks on the disk, with messages like
>
> sd3a: Error for command 'read(10)'
> sd3a: Error Level: Retryable
> sd3a: Block 3932032, Absolute Block: 3932033
> sd3a: Sense Key: Media Error
> sd3a: Vendor ' ' error code: 0x11
>
> Because it's a completely different disk from a different vendor
> I don't believe that it's a disk problem any more. But what
> else could cause such behaviour? The other disk in the same
> box runs without any problems, so I don't think there is
> something wrong with the SCSI bus. But what might be going on?
>
> Any ideas?
>
> Thank you very much!
>
>
> nils-henner
>
> --
> Nils-Henner Krueger, WHU Koblenz, Abt. DV, Tel. +49-261-6509263
> email: nhk@whu-koblenz.de


