Thanks to everyone who responded to my question about giant packets.
It seems that the problem is most likely related to bad equipment on
the network. We talked to Cabletron (our main network hardware
supplier), and they have loaned us a network monitoring package to help
us track down the problem.
Here was the original question:
We keep getting this message on one of our sun4/330 server consoles:
le0: Receive: giant packet from 8:0:20:0:f:3c
le0: Receive: STP in rmd cleared
The ethernet address varies.
What does this mean, and what can we do about it? We have been experiencing
severe network problems (timeouts, etc). Is this related and/or the cause?
And the responses:
------------------------
From: Andrew Luebker <asuvax.eas.asu.edu!eye.psych.umn.edu!aahvdl>
Do you have any micros on your network?
I sometimes see those messages when micro users run the
test programs on the diagnostics floppy shipped with
3com 3c501/3c503 boards for the IBM's and clones...
[Yes, we do and we are looking into this. - Rod]
-----------------------------
From: asuvax.eas.asu.edu!csis.dit.csiro.au!geoff
I would be very interested to get a summary of responses to your request as I have the same problem here!
It has of late "gone away"; this has been the case since we put a bridge in
the network. Our scenario was poor network performance, lots of "NFS not
responding" messages, high input/output error rates, and a network average
collision rate of 3-4%, with some hosts up to 12%.
Since I put the bridge in it has essentially reduced all of the above.
I'm sure that there is a physical problem on the net causing all of the above
but finding out where it is, is turning out to be a real battle.
Anyway I still have no idea what causes the messages you described
(but would like to as it may point to our problem).
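
For readers comparing their own numbers with the collision rates quoted above: a collision rate is commonly worked out from per-interface counters, such as the Opkts and Collis columns that `netstat -i` prints. The sketch below assumes that definition (collisions as a percentage of output packets); the function name and the sample counts are made up for illustration and are not from the original message.

```python
def collision_rate(output_packets, collisions):
    """Return collisions as a percentage of output packets.

    Assumes the usual definition based on interface counters;
    the inputs here are hypothetical, not measured values.
    """
    if output_packets <= 0:
        raise ValueError("output_packets must be positive")
    return 100.0 * collisions / output_packets


# Hypothetical counters for two hosts on the same wire.
print(collision_rate(200000, 7000))
print(collision_rate(50000, 6000))
```

A host whose rate sits well above the network average is a reasonable place to start looking.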
----------------------------
From: asuvax.eas.asu.edu!spdev.East.Sun.COM!tgsmith (Timothy G. Smith - Special Projects)
TFM [ le(4s) ] says:
le%d: Receive: STP in rmd cleared
The driver has received a packet that straddles
multiple receive buffers and therefore consumes
more than one of the LANCE chip's receive descrip-
tors. Provided that all stations on the Ethernet
are operating according to the Ethernet specifica-
tion, this error "should never happen," since the
driver allocates its receive buffers to be large
enough to hold packets of the largest permitted
size. Most likely, some other station on the net
is transmitting packets whose lengths exceed the
maximum permitted for Ethernet.
The first message means that a packet came in that was over 1588 bytes
long. (1588 = 1536 Ethernet MTU + 4 CRC + 48 overrun space) This
shouldn't happen on an Ethernet.
The second message is a result of the first. An "rmd" is a Receive
Message Descriptor. STP is the Start of Packet bit. The second
message means that the chip returned a buffer that didn't have the STP
bit set. The LANCE chip will automatically chain buffers together if
necessary. It should never be necessary to chain buffers since the
driver always allocates 1588 byte buffers. The buffers were chained
because the frame coming in was over 1588 bytes so the chip started
stuffing data into the next buffer.
What all of this really means is that the machine is seeing long
frames. Your network is broken. Might be a broken host, xcvr,
repeater, sun spots, or whatever. Break out the sniffers.
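
The buffer arithmetic in the explanation above can be checked with a short sketch. The three constants are the figures quoted in the message (1536 + 4 + 48 = 1588). Everything else, including the function names and the simple model of one (STP, ENP) flag pair per receive buffer, is an illustrative assumption and not code from the le driver.

```python
# Illustrative model only; names are hypothetical, not taken from the le driver.
MTU_BYTES = 1536      # figure quoted in the message above
CRC_BYTES = 4
OVERRUN_BYTES = 48
BUF_BYTES = MTU_BYTES + CRC_BYTES + OVERRUN_BYTES  # 1588


def descriptors_for(frame_len):
    """Return one (stp, enp) flag pair per receive buffer the frame uses."""
    if frame_len <= 0:
        raise ValueError("frame length must be positive")
    count = -(-frame_len // BUF_BYTES)  # ceiling division
    # STP marks the first buffer of a frame, ENP marks the last one.
    return [(i == 0, i == count - 1) for i in range(count)]


def is_giant(frame_len):
    """A frame is 'giant' when it does not fit in a single receive buffer."""
    return frame_len > BUF_BYTES


print(BUF_BYTES)
print(descriptors_for(1518))
print(descriptors_for(2000))
```

In this model a legal frame yields a single buffer with both flags set, while an oversized frame spills into a second buffer whose STP flag is clear, which is the condition the second console message reports.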
---------------------------
From: asuvax.eas.asu.edu!sunne.East.Sun.COM!stern (Hal Stern - Consultant)
the sun ethernet drivers (ie and le) use a fixed-size buffer
for receiving packets. the buffer is big enough to hold a
single maximum sized packet. if the chip receives a packet
that is larger than the maximum defined for the ethernet, it
copies it into two buffers -- the packet sort of "hangs over"
the edge of the first buffer.
the "giant packet" received message indicates that a packet
larger than the buffer was received. the STP message (you
may also get ENP messages) is a side-effect of the buffer
overflow. each packet has a start and end of packet bit.
normally, they should both be set. but when a packet overflows
into the next buffer, that bit gets a random value -- whatever
bit from the giant packet happens to fall on it. so sometimes
the STP bit will be set to 0 - a "can't happen" with a legal
ethernet packet.
the ethernet address 8:0:20 belongs to a sun. could be that
(a) your ethernet is too long
(b) you have a branch, T- Y- or other illegal topology in
your network, particularly if you're using thinnet
(c) you have a bad transceiver
you might want to put a sniffer on the wire to see what
the giant packets look like -- are they noise or two
valid packets that got glommed together
---------------------------
From: asuvax.eas.asu.edu!NMSU.Edu!jeff (Jeff Harris)
Another place to watch out for is at people with 3Com interface cards in
their PC's. When using the diagnostic disks that used to come with the
cards (I haven't checked lately), one of the tests would cause giant
packet errors. The directions clearly state not to run that particular
test on a running network, but how many people actually read docs :-)
The test seemed to generate packets whose entire
contents (including src and dest fields) alternated between a random
value and 0, so packet contents looked like
AA-00-AA-00-AA-00-AA-00-AA...
When you start to parse the packet, and get to the length field, you can
obviously get some really bizarre results. And to make life more
interesting, the packets do not have valid source addresses, so they are
particularly fun to track down.
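
To see why such a packet parses so badly, here is a small sketch that builds a frame with the alternating pattern described above and reads the standard 14-byte Ethernet header out of it (6 bytes destination, 6 bytes source, 2 bytes type/length). The helper names are hypothetical, and 0xAA stands in for whatever value the diagnostic actually picked.

```python
def make_test_frame(value, length=64):
    """Build a frame whose bytes alternate between `value` and 0."""
    return bytes(value if i % 2 == 0 else 0 for i in range(length))


def format_address(raw):
    """Format address bytes in the colon-separated style used on the console."""
    return ":".join(f"{octet:x}" for octet in raw)


def parse_header(frame):
    """Split the first 14 bytes into destination, source, and type/length."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    return {
        "dst": format_address(frame[0:6]),
        "src": format_address(frame[6:12]),
        "type_or_length": int.from_bytes(frame[12:14], "big"),
    }


header = parse_header(make_test_frame(0xAA))
print(header["src"], header["type_or_length"])
```

With 0xAA as the fill value, the source address comes out as aa:0:aa:0:aa:0 and the type/length field as 0xAA00 (43520), far beyond any frame size mentioned in this thread.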
-------------------------------
From: asuvax.eas.asu.edu!amil.co.il!leonid (Leonid Rosenboim)
This is a guess because I don't have a 330 nor have I ever seen such a
message but it might help:
The maximum Ethernet packet size (including header) is 1538 bytes, as
specified by the standard. In practice, various Ethernet chips can be
programmed for a larger packet size, which would be rejected by the
destination node if it was not modified accordingly, but I never heard
of anybody doing such bad things.
Since the Ethernet address changes, and they look like legitimate Sun
addresses (8:0:20:...), this is a different case. I think that you have
serious electrical problems on your Ethernet wire, or in your low-level
ethernet equipment (transceivers, repeaters, fan-outs etc.). But since
you did not elaborate on the type of media you are using, there is very
little I can add. You may however try to deduce the source of the
problem by either:
1. Gathering the different ethernet addresses and guessing the
   particular area,
2. try to replace some of the low-level stuff that you suspect
3. divide your network into subnetworks for testing ala binary search
4. get a LAN analyzer and do it by the book.
5. call a consulting company for help
I would like to hear about your progress however, because I like to
help people.
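
Suggestion 1 above (gathering the reported addresses) is easy to automate if the console messages are saved to a file. The sketch below assumes lines in the format shown in the original question and tallies how often each source address appears. The regular expression and the function name are my own, and nothing here is specific to any Sun logging facility.

```python
import re
from collections import Counter

# Matches console lines like: le0: Receive: giant packet from 8:0:20:0:f:3c
GIANT_RE = re.compile(r"^le\d+: Receive: giant packet from ([0-9a-fA-F:]+)$")


def count_giant_sources(lines):
    """Count giant-packet reports per source Ethernet address."""
    counts = Counter()
    for line in lines:
        match = GIANT_RE.match(line.strip())
        if match:
            counts[match.group(1).lower()] += 1
    return counts


sample = [
    "le0: Receive: giant packet from 8:0:20:0:f:3c",
    "le0: Receive: STP in rmd cleared",
]
for address, count in count_giant_sources(sample).most_common():
    print(address, count)
```

Bear in mind that the source address of a damaged frame may itself be garbage, so the tally is a hint about where to look rather than proof of which host is at fault.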