Thanks to all who replied. No consensus, but several possibilities were
suggested....
Paul.White@sun-microsystems.co.uk
stern@sunne.East.Sun.COM:-
More NFS daemons. I upped the # of daemons to 32 on one server. It didn't make
a difference that I could see from "WireTap".
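(For reference: on SunOS 4.x - and, I assume, on OS/MP - the daemons are
started from /etc/rc.local, so the change is just the count on the nfsd
line. Assuming the stock startup script, it looks something like:

    if [ -f /usr/etc/nfsd ]; then
            nfsd 32 & echo -n ' nfsd'          # stock script starts 8
    fi

followed by a reboot, or kill the running nfsd's and start 32 by hand.)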
Several people suggested:-
Prestoserve. The servers are "read-only" so this won't help.
(I should have stated that more clearly)
Ether load. This may be a factor.
poffen@San-Jose.ate.slb.com
davis@udecc.engr.udayton.edu:-
Subnetting might help. Yes. See below.
Look at PC packet sizes - maybe there's a lot of retrying going on if "NetWare"
is involved. We have just one network: two servers plus the PCs we have been
experimenting on. The PC default packet size is 8192 bytes.
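(A back-of-envelope point on those 8K packets, not something we have
measured: an 8192-byte NFS read or write travels as a single UDP datagram,
which on Ethernet is chopped into roughly 8192 / 1480 = 6 IP fragments, so
losing any one fragment throws away the whole 8K and forces a complete
retransmit.)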
-------------------
Network loading seems to be significant. We can provoke problems by
simultaneously booting several PCs, so we are looking at tftp. It seems to
have a retry problem: the PC retransmits its last "ack" after a timeout set
in the PC ROM which seems too short (for our servers), and this provokes
escalating retransmits. We are trying to rewrite the tftp server-end code to
stop this (anyone been here before? - a sketch of what we have in mind is
below). We have also subnetted, and made some improvement. We still see
puzzling delays when the PC performs a mount, which we can't explain. We hope
to look at this more closely with an ether monitor.
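For anyone wrestling with the same thing, here is the shape of the fix we are
attempting. This is not our actual code, just a minimal sketch of a tftpd
send loop (assuming an ordinary sendto()/select()/recv() server and the
struct tftphdr definitions from <arpa/tftp.h>; send_file() and its arguments
are invented for the sketch). The point is that the server retransmits a DATA
block only on its own timer and silently drops duplicate ACKs, so an
impatient client can no longer make the retransmissions multiply:

#include <stdio.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/tftp.h>            /* struct tftphdr, DATA, ACK, SEGSIZE */

#define REXMT_SECS    5           /* server-side retransmit timer */
#define MAX_TIMEOUTS  5           /* give up on a dead client */

/* Send one file to an already-identified client; 0 on success. */
int
send_file(int sock, struct sockaddr_in *client, FILE *fp)
{
    char out[SEGSIZE + 4], in[SEGSIZE + 4];
    struct tftphdr *dp = (struct tftphdr *) out;
    struct tftphdr *ap = (struct tftphdr *) in;
    unsigned short block = 1;
    int len, n, timeouts;

    do {
        len = fread(dp->th_data, 1, SEGSIZE, fp);
        dp->th_opcode = htons((u_short) DATA);
        dp->th_block  = htons(block);
        sendto(sock, out, len + 4, 0,
               (struct sockaddr *) client, sizeof(*client));
        timeouts = 0;

        for (;;) {
            fd_set fds;
            struct timeval tv;

            FD_ZERO(&fds);
            FD_SET(sock, &fds);
            tv.tv_sec  = REXMT_SECS;
            tv.tv_usec = 0;
            if (select(sock + 1, &fds, (fd_set *)0, (fd_set *)0, &tv) <= 0) {
                /* OUR timer expired: retransmit, but don't loop forever */
                if (++timeouts > MAX_TIMEOUTS)
                    return -1;
                sendto(sock, out, len + 4, 0,
                       (struct sockaddr *) client, sizeof(*client));
                continue;
            }
            n = recv(sock, in, sizeof(in), 0);
            if (n >= 4 && ntohs(ap->th_opcode) == ACK &&
                ntohs(ap->th_block) == block)
                break;            /* the ACK we wanted - send next block */
            /* Anything else - in particular a duplicate ACK for an
             * earlier block - is ignored.  Answering it with another
             * copy of the data is what makes the retransmits escalate. */
        }
        block++;
    } while (len == SEGSIZE);     /* a short final block ends the transfer */

    return 0;
}

This is essentially the fix RFC 1123 describes for the tftp "Sorcerer's
Apprentice" problem, so if anyone has a cleaner way of doing it I would
still like to hear about it.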
Thanks to all respondents. I think this problem is going to run on. We are
probably trying to do too much with low-end servers, so the final solution will
probably be to spend money - more servers, more subnets!
The original question...
Here's a performance problem I'm having on an NFS server that's serving
DOS/Windows S/W to PCs. It also acts as a boot server. It's a Sun clone
(Solbourne S4000), which is a bit like an SS1+ (so I believe). It runs under
OS/MP 4.1A.1, which is a bit like SunOS 4.1.1 (so I believe).
This system started off with 16Megs of memory. It started groaning when we got
to upwards of 80 PC clients, and the volume of software being served
got to approx 2Gbytes.
I looked at vmstat and nfsstat for a bit and decided that it needed more memory,
so I took the only available course on this hardware and changed the bank of
8 x 1Meg SIMMs to 8 x 4Megs. With 8Megs hard wired it then had 40Megs. It
didn't improve things.
I then noticed from vmstat it usually had 20-25Megs free, even under heavy
load. So I did two things:
1. Upped MAXUSERS to 64 in the config file
2. Changed vmparam.h to use Sun's paging params of 9/24/90,
supposedly being better for "larger" memory systems. see comments
in SunOS vmparam.h
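(For anyone wanting to do the same: MAXUSERS is the "maxusers" line in the
kernel config file, and the kernel then has to be rebuilt. The paths below
are the stock SunOS 4.x ones and SERVERNAME is a placeholder - OS/MP lays
things out a little differently - so treat this as a rough recipe rather
than gospel:

    # cd /usr/kvm/sys/sun4c/conf        <- architecture directory varies
    # vi SERVERNAME                     <- set "maxusers 64"
    # config SERVERNAME
    # cd ../SERVERNAME
    # make
    # mv /vmunix /vmunix.old ; cp vmunix /vmunix
    # reboot

The vmparam.h change is a straight edit of the paging parameters in the same
source tree, following the comments Sun put in the file.)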
That got it to use memory better (vmstat free went down, and disk I/O was very
much reduced), BUT....... NFS response time didn't improve - it got worse if
anything.
I read Hal Stern's (excellent) book "Managing NFS and NIS", which suggested
that I cut down the number of nfsd's running. I reduced this to 4. It didn't
improve things.
I got a free trial of "WireTap" from AIM Technology, so I can compare the
server with a similar server running with only 16Megs and no parameter
changes. The "small" server thrashes its disks, and pages in more, but
manages a marginally better NFS response (typically 6-12 msec) at the same
load (typically 60 NFS ops/sec). Ether load peaks at 25%; typically it's
10-20%.
The "big" system is noticeably faster for commands on its console when load is
high - but that's not what I set out to fix.
The problem doesn't seem to be lack of CPU capacity, although performance dips
do occur when, now and then, system CPU time gets above 80%. The "small" server
seems more prone to such dips.
Anyone got any suggestions?
-------------------------------------------------------------------------
Gordon Robertson, Head of Systems, Aberdeen University Computing Centre
Tel 0224 273340
E-Mail : g.robertson@abdn.ac.uk
--------------------------------------------------------------------------