I've been slow about sending a summary, since the information I collected
was pretty meager, but I've started seeing reports from others having similar
problems, so I thought I'd share our experiences:
The problem:
> Recently, we swapped computers ("pizza boxes" only), keeping (basically)
> the same disk drives and monitor.
> The previous system was a SparcStation 1, running 4.1, with 2 external
> Quantum ProDrives and a CDC Wren.
> The new system is a SparcStation 1+, running 4.1.1. It has an internal
> Quantum ProDrive, one of the old external Quantums, and the CDC Wren.
> Note that we did not "install" 4.1.1, but just switched computers.
> Since the switch we have been getting intermittent write errors on the
> swap partition:
> Feb 13 08:04:07 kankakee vmunix: sd3e: Error for command 'write'
> Feb 13 08:04:07 kankakee vmunix: sd3e: Error Level: Fatal
> Feb 13 08:04:07 kankakee vmunix: sd3e: Block 992, Absolute Block: 1407
> Feb 13 08:04:07 kankakee vmunix: sd3e: Sense Key: Media Error
> Feb 13 08:04:07 kankakee vmunix: sd3e: Vendor 'CDC' error code: 0x14
> We'll get a series of these, where the block numbers are usually the same
> within the series, but change for the next series.
> In most cases the swapper seems able to recover and the
> jobs don't seem to be affected. The error is always on a "write".
> (The error code is usually 0x14 or 0x12, occasionally 0x10).
> We had no trouble with this drive before the switch.
(Additional note -- most of this disk is used for our main file system.
Only the swap partition was producing the error messages.)
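For anyone who wants to check whether their own errors cluster the same way (same block within a series, a small set of vendor codes), here is a minimal sketch in Python. It assumes the syslog lines look exactly like the excerpt above; the names `tally`, `BLOCK_RE`, and `CODE_RE` are made up for illustration and are not part of any Sun tool.

```python
import re
from collections import Counter

# Assumed line formats, copied from the log excerpt quoted above.
# Matches: "vmunix: sd3e: Block 992, Absolute Block: 1407"
BLOCK_RE = re.compile(
    r"vmunix: (?P<dev>sd\d+[a-h]): Block (?P<block>\d+), "
    r"Absolute Block: (?P<abs>\d+)"
)
# Matches: "vmunix: sd3e: Vendor 'CDC' error code: 0x14"
CODE_RE = re.compile(
    r"vmunix: (?P<dev>sd\d+[a-h]): Vendor '(?P<vendor>[^']*)' "
    r"error code: (?P<code>0x[0-9a-fA-F]+)"
)


def tally(lines):
    """Count errors per (device, absolute block) and per (device, code)."""
    blocks = Counter()
    codes = Counter()
    for line in lines:
        m = BLOCK_RE.search(line)
        if m:
            blocks[(m["dev"], int(m["abs"]))] += 1
            continue
        m = CODE_RE.search(line)
        if m:
            codes[(m["dev"], m["code"].lower())] += 1
    return blocks, codes


# The five lines from the report above, used as sample input.
sample = [
    "Feb 13 08:04:07 kankakee vmunix: sd3e: Error for command 'write'",
    "Feb 13 08:04:07 kankakee vmunix: sd3e: Error Level: Fatal",
    "Feb 13 08:04:07 kankakee vmunix: sd3e: Block 992, Absolute Block: 1407",
    "Feb 13 08:04:07 kankakee vmunix: sd3e: Sense Key: Media Error",
    "Feb 13 08:04:07 kankakee vmunix: sd3e: Vendor 'CDC' error code: 0x14",
]

blocks, codes = tally(sample)
for (dev, abs_block), n in sorted(blocks.items()):
    print(dev, "absolute block", abs_block, "count", n)
for (dev, code), n in sorted(codes.items()):
    print(dev, "vendor code", code, "count", n)
```

To run it against a real log, pass an open file in place of `sample`, e.g. `tally(open(path))` with `path` pointing at wherever your syslog messages are kept.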
I got 2 responses to this -- one person said he had experienced the same
thing after upgrading an SS1+ to 4.1.1, and was waiting for information
from the people who sold them the disk. The other person suggested that
we check our cables (good advice) and run read/analyze on the disk. We had
already done both, with no edifying results.
I called Sun software support, and they spent some time on it, but felt
"Media Error" was pointing to disk problems. They asked us to try reformatting.
We did ("format" reported no defects), and decided to take advantage of the
down time to run "newfs" so we could use "I/O clustering."
We haven't seen the problem since, and we've exercised the swapper heavily.
I don't know if "format" or "newfs" was the solution (we didn't have the
experimental method in mind at the time... sorry!) but if "format" fixed
something it sure was quiet about it. Could it be that the swapper expects
the I/O clustering format? (The documentation states that filesystems and
files created under other releases are supported, but what about swap
partitions?)
Hope this helps!
Sue Bergquist
Tektronix