SUMMARY: Token ring woes.

From: Neil Rickert (rickert@cs.niu.edu)
Date: Thu Aug 11 1994 - 16:53:48 CDT


I asked:

  I have a Sparc 1000, running Solaris 2.3, with a token ring card.
  About every one or two days it logs a message "tr0 soft error", and
  the token ring dies. The only way to resolve the problem seems to be
  to reboot. The failure occurs at random times of day, and does not
  appear to be related to any particular process.

I received three suggestions:

 -----------------------------------------

    Hard-code the active interface by adding the following line to your
    /kernel/drv/options.conf file:
    
    cable-selection="tpe";

 -----------------------------------------

   this is what you need on a SS1000/2000:

   Token Ring Board P/N 501-1923-02 (w/ LSI DMA+ chip not AT&T chip)
   TRI/S 3.0.1 software
   Patch 101751-01.

 -----------------------------------------

   run snoop to restart tr0

   /usr/sbin/snoop -d tr0 -c 50 > /dev/null 2>&1

 -----------------------------------------

I tried each of these solutions.

I am not sure about the first suggestion (an entry in options.conf).
It is difficult to be sure whether it improved the situation or not.
It did not completely solve it. My next failure came after three days,
which was longer than average. The ring also recovered from a problem
during that period -- it had only once before recovered by itself from
a soft error.

Patch 101751-01 certainly helped. The system ran for about 6 days
without token ring problems. During that time it logged a number of
successful recoveries from ring errors. NOTE: This patch applies
only to the SS1000/SS2000, and requires the correct version of the TR
card.

After running well for a little over 6 days, the ring failed. No message
was logged, but traffic stopped. As I was logged in at the console
during the failure, I tried the 'snoop' command. After the snoop, a
test with 'ping' showed that communications had been restored. Of course,
this should be the snoop that came with the TR software, rather than
the original Ethernet-only version that came with Solaris.

I guess my final conclusion is to keep the patch in place. Previously
I had been running a script from crontab to look for network problems,
and reboot if the ring failed. I will now modify that script to attempt
to restart by using 'snoop'.
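
A minimal sketch of what such a crontab script might look like is given
here. It is not the script actually in use. The host name "gateway", the
10-second settle time, the function names and the "run" argument are all
hypothetical; only the snoop invocation is taken from the suggestion
quoted above. It pings a known host on the ring, and if that fails it
runs snoop on tr0 and tests again.

```shell
#!/bin/sh
# Hypothetical token ring watchdog (illustrative sketch only).
# Every default below can be overridden from the environment.
IFACE=${IFACE:-tr0}
PEER=${PEER:-gateway}            # hypothetical: any host reachable over the ring
PING=${PING:-/usr/sbin/ping}
SNOOP=${SNOOP:-/usr/sbin/snoop}  # must be the snoop shipped with the TR software
LOGGER=${LOGGER:-logger}
SETTLE=${SETTLE:-10}             # assumed pause (seconds) after the snoop

# Succeeds if the peer answers. Solaris syntax: ping host [timeout].
ring_alive() {
    "$PING" "$PEER" 5 > /dev/null 2>&1
}

# Same snoop invocation as in the suggestion above.
restart_ring() {
    "$SNOOP" -d "$IFACE" -c 50 > /dev/null 2>&1
}

# Returns 0 if the ring is up (possibly after a snoop), 1 if it is still down.
check_ring() {
    if ring_alive; then
        return 0
    fi
    restart_ring
    sleep "$SETTLE"
    if ring_alive; then
        "$LOGGER" -p daemon.notice "$IFACE restarted with snoop"
        return 0
    fi
    "$LOGGER" -p daemon.err "$IFACE still down after snoop"
    return 1
}

# Do the check only when invoked with the argument "run".
if [ "${1:-}" = "run" ]; then
    check_ring
fi
```

Assuming the script is installed as /usr/local/bin/trwatch (also a
hypothetical name), a root crontab entry such as

    0,10,20,30,40,50 * * * * /usr/local/bin/trwatch run

would test the ring every ten minutes. The sketch only reports a failed
restart through its exit status and syslog; whether to fall back to a
reboot at that point is left to the caller.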

Many thanks to:

        Phil Barr <pbarr@metronet.com>
        Kenneth.Erickson@Corp.Sun.COM (Ken Erickson)
        "Willard F. Dawson" <wdawson@crl.com>


