Reply-to: Bob Hendley <R.J.Hendley@computer-science.birmingham.ac.uk>
Followup-to: Bob Hendley <R.J.Hendley@computer-science.birmingham.ac.uk>
Here is the summary of the replies to a request I posted for comments on network configuration.
I've included the original posting and verbatim replies at the end.
Thanks to all who replied.
----------------------------------------------------------------------
Our objectives are to find the right balance of configuration and to spend
money where it will be most effective. A further consideration is to provide
a manageable system. In particular, this means we don't start from the prejudice
that the SLCs MUST be used as workstations because anything else would waste cycles;
if overall performance is better the other way then we would do it (after all, that is
what we all do with Sun3s!)
To reiterate our current setup:
                      | To Campus FDDI
                      |
                ------------
                |          |
                |          |
                ------------
                      |
                      |
  ---------------------------.....----------------------------
  School Backbone    |                           |
                     |                           |
                ---------                   -----------
                |       |                   |         |
                |       |                   |         |
                ---------                   -----------
                     |                           |
                     |                           |
               20 mainly DL                 25 SLCs (DL)
               and Xterms
We have Suns routing between our various networks and also the outside world. There is
one server (fat-controller) on which all local software is installed (this is also the
yp master). Other servers have a full Sun distribution and a subset of local
software (e.g. frequently accessed binaries). All servers are slave yp servers. User
files are held on the server most local to the machines that that user is expected
to use most often.
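[For completeness: a slave yp server can be brought up against fat-controller roughly
as below. This is only a rough sketch for SunOS 4.x NIS, and the domain name shown
is invented.
    # on each additional server, as root
    domainname bham-cs                 # hypothetical yp domain name
    ypbind                             # bind to the domain
    ypinit -s fat-controller           # copy the maps from the yp master
                                       # (ypinit lives in /usr/etc/yp on SunOS 4.x)
]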
So, what will we do?
Well, it's always nice to have your own views confirmed. The consensus
(both from these replies and others) is fairly clear: we should largely leave the
network as it is. The SS10 should go on the network with the Xterms. If it is coping
well on its own, we should consider adding a second ethernet interface to put it directly
on to the network with the 25 SLCs as well.
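[If we do add the second interface, the change itself is small. A rough sketch for
SunOS 4.x follows; the interface name (le1) and the host name/address are invented
for illustration only.
    # /etc/hosts - give the second interface its own name and address, e.g.
    #    1.2.3.4    ss10-slcnet        (placeholder address and name)
    # bring the interface up (normally done from the boot scripts):
    ifconfig le1 ss10-slcnet netmask 255.255.255.0 up
]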
Hopefully, with some guidance, students will balance their work between the local
machine and the SS10 to maximise performance. The servers can then be restricted
to true fileserving.
The received wisdom seems to be to add another 4/8M to the diskless machines, but it
is not clear we can run to this.
As a separate exercise we will try to replicate all of the local software on each server,
although I don't think you can ever do this completely!
[Of course, if we were starting from scratch the way forward might be very different!]
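[One way to attempt the replication is rdist from fat-controller. The Distfile below
is only a sketch, and the server names are invented.
    # Distfile on fat-controller - run with:  rdist -f Distfile
    # (-R below also removes files that have been deleted on the master)
    HOSTS = ( serverA serverB )
    FILES = ( /usr/local )
    ${FILES} -> ${HOSTS}
            install -R;
]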
----------------------------------------------------------------------
The original posting
> To: sun-managers@eecs.nwu.edu
> Subject: Network configuration
> Date: Wed, 24 Jun 92 15:22:21 BST
> From: Bob Hendley <rjh@cs.bham.ac.uk>
>
>
> Background:
>
> We have two ethernets used by undergraduates. One has two servers and
> 25 diskless SLCs (8m), the other has 10 diskless SLCs and 10 xterms (and one
> server). The servers act as routers to a departmental backbone. The two
> networks are in different rooms and serve different purposes (so there is
> relatively little traffic that is not local to the network). We are about
> to add a SS10 as a compute server on the second network.
>
> The machines are used for software development in C, Prolog, Lisp etc.
>
> We are debating how best to configure these and other networks, in particular:
>
> . Do we leave it as it is
>
> . Do we make it one network
>
> . Could we run all the workstations as Xterms with all processes running
> on the SS10
>
> We have to make a decision soon and so cannot collect useful statistics
> on current server/network performance (our students are on vacation)
>
> Questions:
>
> I'd appreciate any advice on configuring these Suns, but in particular:
>
> . How many xterms might an SS10 reasonably support (it will be a single
> processor machine with 64M, but this could be adjusted - in time)
>
> . Are we better off with all 'real' work on a server or on a workstation
> - do we run an xkernel on the SLCs
>
> . How many machines can reasonably go on one ethernet (all xterms or all diskless
> SLCs)
>
> . Where is the bottleneck likely to be - is it the server, the servers network
> interface(s) or the ethernet
>
> I appreciate that there is no clear answer to this, but would appreciate
> people's experiences/opinions/prejudices
>
>
> Bob Hendley
> School of Computer Science
> University of Birmingham
> UK
----------------------------------------------------------------------
From: Jeff Bacon <bacon@mtu.edu>
> . Do we leave it as it is
if it works, why not?
> . Do we make it one network
I wouldn't. you probably got enough on each net as is, IMHO. besides, why
concentrate? diffuse. it's the way of ethernet in the 90s.
> . Could we run all the workstations as Xterms with all processes running
> on the SS10
you could...but it really really depends on what you're doing. see below.
> . How many xterms might an SS10 reasonably support (it will be a single
> processor machine with 64M, but this could be adjusted - in time)
> . Are we better off with all 'real' work on a server or on a workstation
in terms of simply the machine, it could probably handle the 45 users
represented by the 45 machines well enough. keep in mind though that I
presume a decent i/o setup - probably dual scsi buses and well thought-out
partitioning. also, you'd want to put a second and maybe third ether card
in the ss10 and use it as the router for both networks.
depends on what is happening in the lab though. is it one of those ones
where everyone is doing the exact same thing at the same time ("Now, students,
type 'cc filename.c' and hit return.")? random usage, homework only?
upperclassmen running massive projects?
we have quite a problem with that in one of our labs; from 8 to 5,
there are consecutive 2-hour labs in there, using SDRC I-DEAS (diskful
ss2s, one software server). on the bell, everyone walks in, sits down,
logs in, starts sunview, and types "ideas", whose first executable is
15MB and they get bigger. massive point loads. chokes the network to hell,
but the server could care less cause it has enough memory that it just
maps ideasmn.exe into memory and hands it out 15 times. (used to be
worse though - the lab used to be 8MB 3/110s and 3/140s, 8 DL off a 3/260,
the rest root/swap local, with a 3/160 serving software. they'd run those
labs in there then, and it would take 10-15 minutes to get everyone going,
especially on the DL ones...)
isn't it wonderful how many variables there are to consider?
also consider: do you wanna bog down your shiny SS10 with everyone's hello.c
work? or save some of it for those really big projects that the
best-and-brightest are always doing? especially when you have a whole bunch
of 12.5MIP processors that will otherwise be spinning pretty much idle? I
would opine that this is a golden opportunity to set up a nice
high-horsepower compute server reserve. (just like military combat.)
> - do we run an xkernel on the SLCs
waste of an SLC probably. that's the bugaboo. for editing/WP/light compiling,
your DL SLCs are probably fine as is (assuming you run MIT X or sunview -
but then again running openwindows on an 8MB DL SLC at all is just not
a profitable idea - take my word on this).
also, there's the real big plus that, if your student is just sitting
there vi'ing (or WP'ing, etc) on a DL SLC, that machine isn't likely to
need to swap a lot and thus you're not making a lot of traffic. an xterm
would be going at the network like all hell in that case. course, the
scenario is a bit reversed for compiling...so, what do you end up doing
more, in terms of the overall, compiling or editing? I'll bet it's editing.
your lisp jobs are likely to be quite a chunk to swallow for those SLCs
though. but that's a variable. are we talking real lisp work here? or
just people fiddling with it to get a taste? and how many people use it?
or is C the predominant language used? (probably.)
I'd say keep using the DL SLCs as is and save your SS10 for heavy
compiling/large run jobs, and probably route your lisp there too if
you can and it's big enough.
believe it or not, though, in the end, given free run of things and a little
guidance, I think you'll find that your users will balance their usage out
for you. they're not all dumb. (and sometimes, they're dumb enough to
be happy with whatever also. :) )
> . How many machines can reasonably go on one ethernet (all xterms or all diskless
> SLCs)
we run 20 DL SLCs off 2 servers on one of our subnets. that's probably
about enough, maybe a few more. xterminals? dunno, ain't got none. depending
on what you're doing (heavy char i/o? heavy graphics? things just sitting
around a lot?), I would assume about the same.
IMHO, I'd say err on the low side - configure too much network.
you'll use it someday, trust me. you're set up in two with 20 and 25,
you're probably happy. and there's no point combining when you're already
split and there's not much reason to cross-talk.
> . Where is the bottleneck likely to be - is it the server, the servers network
> interface(s) or the ethernet
I'd bet on network. character i/o on an xterminal will kill you, I'd say -
lots of packets being generated all the time with almost nothing in them.
Your servers can be pretty well-tuned to take an awful lot. and the
Sun lance ethernet interface seems to be one of the best around in terms
of i/o capability. (especially the on-board one - it's been said by sun-type
ppl that the on-board one is considerably faster, cause it's on the cpu
board, just inches from the CPU itself - it doesn't actually have to talk thru
the SBus - massive throughput.)
oh yeah. secondary intel ethernet VME cards supposedly suck rocks. so
if your router/servers are VME-based, you probably oughta have the
local network attached to the on-CPU-board interface.
so, the moment of truth, bacon's suggestions:
put the SS10 on the second net. hang the xterms off it. use the one server
there for the 10 SLCs. leave the second net alone. tune the hell out of
the servers as pure file servers and keep ppl off them. encourage your ppl
who need the SS10 compute power to use the xterms. if you're planning
on a good bit of usage of that SS10 from the other network, consider
popping a second ethernet in it and hook it right into the other net.
(you could do that anyway and use it as a router as well, and take some
load off your fileservers, if it's physically feasible. *shrug*)
there's also the issues of:
- is it all one yp domain? consider the YP traffic flow. it can
be considerable.
- where are the home dirs? make sure that stuff's routed right.
- tmpfs for /tmp is probably a really good idea (fstab sketch after this list).
- if you can, get more memory for your SLCs, your users will love you.
the step from 8 to 12 is noticeable.
- if your machines have multiple disks, watch the traffic flow, make sure
it's evenly split amongst disks. if you have a lot of disks, consider
add-on scsi buses - they're dirt cheap ($250 educational from sun?)
and can buy you a lot on busy file servers.
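[On the tmpfs point: on SunOS 4.1.x this should be one fstab line per client (the
kernel may also need the TMPFS option configured in). A rough sketch:
    # /etc/fstab entry for a tmpfs /tmp
    swap    /tmp    tmp    rw    0    0
    # then:  mount /tmp
]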
the moral: the tuning of your machines can be as important as the
configuration of your network. cover all the bases.
hope I haven't overwhelmed you. good luck with your new ss10. wish I was
getting one. :)
--
= Jeffery Bacon   General Systems Hack - C&EE, ME, ChemEng/Chemistry, MTU =
= bacon@mtu.edu, bacon@mtus5.bitnet   ph-(906)487-2197   fax-(906)487-2822 =
----------------------------------------------------------------------
From: Mike Raffety <miker@sbcoc.com>
Stay distributed ... you should probably even consider adding disks to those diskless machines, at least to provide for a local swap (see the sketch below).
No way you're going to move all that processing to one SS-10, and even if you did, you'd have a lot of wasted CPU cycles, using all those workstations as dumb X terminals.
You've got enough machines you need to stay on at least two networks. If both need to access the SS-10, buy an Ethernet card for it, so it can be local to both networks.
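[On the local-swap suggestion: on SunOS 4.x extra swap is an fstab entry plus swapon.
A rough sketch, assuming the local disk's b partition (sd0b) is set aside for swap:
    # /etc/fstab entry for a local swap partition
    /dev/sd0b    swap    swap    rw    0    0
    # enable it without a reboot:
    swapon -a
]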
----------------------------------------------------------------------
From: Eckhard.Rueggeberg@ts.go.dlr.de
. Could we run all the workstations as Xterms with all processes running on the SS10
Perhaps you could, but it would not be efficient to let the compute server do what the workstations could do themselves. Instead, I would suggest upgrading the SLCs' memory to reduce load on the compute server and traffic on the network (or even upgrading them to ELCs, which can hold up to 64 MB).
. Do we make it one network
No. Leave it as it is if you don't expect too many users from the other network to use the compute server for network-intensive jobs.
----------------------------------------------------------------------
From: fsh@csres.rdg.ac.uk (Steve Han)
We at Reading had a similar setup. Our solution was to subnet our various domains so that we now have the following subnets:
A. Undergraduate subnet with 20 SLC and two servers (SS1+ & SS2). We now plan to increase this number to 30 seats.
B. Undergraduate subnet with a small number of Sun 3 (mainly used for robotics work).
C. Postgraduate subnet with a mixture of SLC, ELC, SS1, SS1+ and IPC.
D. Contract research subnet with SLC, SS2, SS1+.
E. Contract research subnet with 670MP, SLC, IPX, SS2 (and soon, an SS10-54).
F. Staff subnet with SLC, ELC and Sun3 in staff offices.
G. PCNFS subnets for 80386 PCs.
We are fully networked to the Computer Services Centre (CSC) via fiber optic link and our undergraduate filestore is on their Amdahl UTS. Our CSC will shortly move everyone onto their recently acquired Auspex. Our staff and postgraduates have filestore on Computer Science disks. They also have separate accounts with CSC so that their filestore is on the UTS/Auspex.
Our experiences are listed below:
1. LOCAL DISK ON SUBNET: Undergraduate filestore is on CSC mass storage disks. This causes an excessive ethernet traffic bottleneck when all our students want to access their filestore from our undergraduate subnet with SLC's (A). We are hoping to install a 1.3GB local disk in our own subnet for our undergraduate filestore so as to minimise traffic from outside our subnet, but this is running into some "politicking" difficulty with CSC.
2. INCREASE MEMORY ON SLC's: Our applications in the undergraduate subnet with SLC's (A) are mainly C, StP (Software thru Pictures CASE tool), ATT C++, gnu g++, Poplog 14.1 (Pop-11, Lisp, ML, Prolog), LaTeX, X11R5, Openwindows 3, Orwell6, BTOOLS, poplogneural, poplogflex, poplogrules and hips. All these work fine for our existing 20 SLC's; however, with an increase to 30 seats, there may be some performance degradation. Our systems analysis suggests that the bottleneck lies in local memory, viz. our SLC's only have the standard 8MB of memory. Funding permitting, we hope to upgrade this to 16MB.
3. KEEP SERVERS: From our analysis, the servers are not stretched (using etherfind, netstat, vmstat and performance meters). They merely provide swapping for the diskless SLC's. Incidentally, we do not allow our undergraduates to log in to our two servers on this subnet. This way we have increased security, and all "real" work is done on diskless clients and not on the servers.
ADVICE:
1. Put all undergraduate hosts on one or two subnets (max: 32 hosts per subnet, i.e. 30 SLC's or xterms excluding two hosts). This would reduce traffic, provide more security and make life easier for your systems administration staff. Also, if you have not done so already, set full PROM security and a security password on the SLC consoles (which should also be deemed insecure), and remove global r-x permission on /usr/kvm/crash too (see the sketch below). Our undergraduates have 24-hour access to the Lab containing the undergraduate SLC's. Security is a big issue here, and we have, more or less, closed most holes in the system.
2. All "real" work should be done on the workstations. After all, this is the era of the intelligent client-server environment.
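[A rough sketch of those two hardening steps, run as root on each SLC (SunOS 4.x):
    eeprom security-mode=command       # or security-mode=full
    eeprom security-password=          # prompts for the new PROM password
    chmod o-rx /usr/kvm/crash          # remove world read/execute on crash
]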
No doubt, you may disagree with the above, and you will nevertheless receive contradictory advice, experiences, etc. I think everyone on the net would appreciate a summary of your responses. Looking forward to receiving it on Sun-Managers.
Best wishes and regards.
----------------------------------------------------------------------
From: cyerkes@jpmorgan.com (Chuck Yerkes)
The machines are used for software development in C, Prolog, Lisp etc.
We are debating how best to configure these and other networks, in particular:
. Do we leave it as it is
. Do we make it one network
. Could we run all the workstations as Xterms with all processes running on the SS10
Why waste a SPARC CPU to be an XTerm? XTerms are positioned to be half the price of diskless machines. With diskless machines, all the server does is provide NFS services; the computation is done locally by a fairly powerful computer. Xterms, on the other hand, simply act as graphic heads to the computer. The XTerm does nothing locally (except drawing graphics) and is a burden on the server.
We have to make a decision soon and so cannot collect useful statistics on current server/network performance (our students are on vacation)
You can grab stats later and change details later (like adding a bridge) and still get a reasonable setup.
Questions:
I'd appreciate any advice on configuring these Suns, but in particular:
. How many xterms might an SS10 reasonably support (it will be a single processor machine with 64M, but this could be adjusted - in time)
Unknown to me.
. Are we better off with all 'real' work on a server or on a workstation - do we run an xkernel on the SLCs
NO NO NO!! This is entirely counterproductive!
. How many machines can reasonably go on one ethernet (all xterms or all diskless SLCs)
I have 60 machines (16 XTerms, 6 SLC's and a bunch of dataless workstations; also 3 NeXTs being served by another NeXT) on a divided ethernet (the server lives on both sides). We use Frame and Wingz (HEAVY X traffic) and compile everywhere.
. Where is the bottleneck likely to be - is it the server, the servers network interface(s) or the ethernet
It really depends - how much compiling vs. editing is there? That calls for statistics gathering (the SS10 is unknown to me).
----------------------------------------------------------------------