In response to my query about what tools are available to find out
which workstation or user is loading my server down, I received the
following pointers, which I have outlined below (a few example
invocations follow the tool list):
Software Tools:
nfswatch (PD available by anon ftp from ftp.erg.sri.com)
nfsstat (Sun OS)
etherfind (Sun OS)
traffic (Sun OS (Suntools))
netstat (Sun OS)
ethertop (PD recently posted to comp.sources.unix)
NetMetrix (commercial software, originally called
EtherView; (603) 888-7000)
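As a starting point, here are a few invocations of the Sun OS tools
that I found useful. These are the flags as I understand them; please
check the man pages on your own release:

    nfsstat -s      # server-side NFS call counts (lookup, read, write, ...)
    nfsstat -rc     # client-side RPC statistics, including retransmissions
    netstat -i      # per-interface packet and error counts
    etherfind ...   # dump packet headers, filtered by source/destination
                    # host (see the etherfind man page for the filter syntax)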
Things to get:
Hal Stern's book "Managing NFS and NIS".
NFS patches from Sun.
Look at:
reducing the number of nfsd daemons started by rc.local (with tradeoffs)
the number of lookups and symbolic link lookups on the server
the retransmission rate on the clients (see the nfsstat examples after this list)
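For the last two items, nfsstat is where I looked for numbers (same
caveat as above about checking the man pages):

    nfsstat -s      # server call mix: a large share of lookup/readlink
                    # calls suggests deep paths or heavy symbolic link use
    nfsstat -rc     # run on each client: compare "retrans" to "calls" to
                    # get the retransmission rate

For the nfsd count, on my SunOS 4.x systems the relevant line in
/etc/rc.local looks roughly like this (the default starts 8 daemons;
reduce it only with the tradeoffs mentioned above in mind):

    nfsd 8 & echo -n ' nfsd'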
I have used many of these tools before without really needing them; now
that I needed them, I spaced on using them. :) I found what I was
looking for with etherfind, by watching the source and destination of
packets. It turns out that I had users copying files from one directory
to another via NFS. In other words, a user on a workstation was copying
from directory A to directory B, where A and B were both on the server
and NFS-mounted on the workstation. This creates a lot of traffic,
since the workstation CPU is shepherding every packet of the copy: the
server reads the data, sends it out over the network via nfsd, receives
the same data back through nfsd, and writes it to a slightly different
place on its own disk...
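Where users (or scripts) can be taught, the cure is to do the copy on
the machine that actually owns the disks instead of dragging the data
through the NFS client. A hedged illustration -- the host name and
paths here are made up:

    # from the workstation: every block crosses the network twice
    # (read from the server via nfsd, then written back to it via nfsd)
    cp /home/projA/design.db /home/projB/design.db

    # same result with no NFS data traffic: the copy runs on the server
    rsh fileserver cp /export/projA/design.db /export/projB/design.db

(This assumes rsh access to the server and that you know the
corresponding paths on the server side.)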
This is not abnormal, as many of the database-related programs we use
in CAD inherently do the same thing. In one case, I found the data
being moved across the network 7 times to process one file through
several filters or filter-like programs. I was able to modify that
particular example (a Cadence plotting routine) so that the data
crosses the network only once.
So all of this points to the need for more serving power, smarter
software, education of my users on what they are really doing with some
tasks, and perhaps some dream software that would check for NFS mounts
and, instead of doing the copy over the network, interpret the command
and perform the job remotely on the server (a rough sketch of what I
mean follows).
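Purely as a thought experiment, that dream wrapper might look something
like the Bourne shell sketch below. Everything here is hypothetical --
the name nfscp, the reliance on df printing NFS mounts as server:/path,
and the assumption that the same path names are valid on the server:

    #!/bin/sh
    # nfscp src dst -- if both files live on the same NFS server, run
    # the copy there with rsh instead of pulling the data through this
    # client.  Rough sketch only: no error checking.
    src=$1
    dst=$2
    # df reports an NFS filesystem as server:/path; grab the server name
    srv1=`df $src | tail -1 | grep : | awk -F: '{print $1}'`
    srv2=`df $dst | tail -1 | grep : | awk -F: '{print $1}'`
    if [ -n "$srv1" -a "$srv1" = "$srv2" ]; then
        rsh $srv1 cp $src $dst    # assumes the same paths exist on the server
    else
        cp $src $dst              # local or cross-server: plain copy
    fi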
As for my questions on "nice", there were no responses that could help
nice a program further than level 20, which is what I am doing now with
a program fired off by cron for certain user-initiated programs. I
think that with my NFS load and the programs I have on the system,
there is contention for the disk, and simply too much to do... What I
need is a nice that can drop a job to a priority just above "idle". :)
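For reference, what I am doing now amounts to roughly this (the script
name is a placeholder):

    # crontab entry: run the batch filter at the bottom of the nice
    # range (level 20 on my systems, as noted above)
    0 22 * * * /usr/bin/nice -20 /usr/local/bin/nightly_filter

    # or push an already-running job down after the fact
    renice 20 -p <pid>

Even at the bottom of the range the job still competes for the disk,
which is the part nice cannot help with.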
I sincerely appreciate the fast responses and pointers that helped me
isolate the traffic / load problem I am faced with. Thanks!!!
I received helpful notes from:
From kalli!kevin@fourx.Aus.Sun.COM
From higgins@math.niu.edu
From @cse.ogi.edu:bit!jayl@cse.ogi.edu
From cypress!cypress.com!mdl@decwrl.dec.com
From JAMES_M._ZIOBRO.WBST102A@xerox.com
From miker@sbcoc.com
From metrix!picasso!neeraj@uunet.UU.NET
From @rock.db.toronto.edu:jdd@db.toronto.edu
Many Thanks and
Happy Holidays!
--Mike
Michael Willett, mike@array.com uupsi!monarch!mike uunet!csn!monarch!mike
So much fun, so little time to enjoy it...