SUMMARY: Scaling Query: (4 x 450) vs (2 x 900)

From: Tim Chipman <chipman_at_ecopiabio.com>
Date: Mon Nov 25 2002 - 17:43:01 EST
Sorry for the late summary ; many thanks to all those who responded (see
below). Here is a semi-concise overview of feedback received on this
topic ; exact quotes follow, then my original posting @ the end of it
all.

Many thanks to: (in no particular order :-)

Hendrik Visage
Joe.Fletcher
Kevin Buterbaugh
Adam Levin
David Newton
Cristophe Dupre


-----Tim Chipman.


General Consensus / Themes:
=-=-=-=-=-=-=-=-=-=-=-=-=-

-> Don't just "assume" that I/O isn't (or is?) a bottleneck ;
characterize your usage requirements as much as possible in advance, to
facilitate rational decision making in capacity planning. (various
capacity planning tools recommended {below} for this purpose)

-> scaling for 2 vs 4 CPUs will depend heavily on the type of work you
are doing, usage profiles by client apps. ie, relatively few
simultaneous CPU-heavy queries with minimal IO requirements scale nicely
to dual-fast-CPU hardware ; many parallel queries (and/or IO-Intensive
queries) will be better serviced by more CPUs (4+) and of course if
possible, optimized I/O layout (good bandwidth, multiple buses used for
moving data, etc).

-> for CPU limited functions, Sun hardware probably isn't the best way
to get the most "bang for your buck" ; some comments were made WRT the
performance of UltraSparc III vs other CPUs ('833 alpha', '2400+
Athlons', etc). [However, clearly this discussion cannot focus purely on
bang-per-buck or the issues get a bit skewed.]

-> WRT Scaling , SMP in general: Theoretically, it is suggested that
2x900 will be more efficient (*very* slightly?!) than 4x450 at crunching
CPU-limited jobs, due to somewhat simplified scheduling issues ;
however, the general consensus is that SunSparc gear does scale quite
linearly / very smoothly. Cache hits / utilization, memory use profiles
will have an impact here clearly as well, depending again on "whatever
it is that you are doing".

-> In general, the perception of "sun is good at DB servers ; X86 gear
may not be" was repeated more often than not (again, with the
disclaimer, "what you are *really doing* determines everything").
However! One counter-example was a report of migrating a DB environment
from legacy SunSparc gear to legacy Proliant gear running SuSE Linux
(ie, from an e450 2x400, 2 gigs, internal HW raid --> Proliant 2x550
Xeon/ 1.5 gigs/ HW "SmartRaid" array), which dropped runtime for an
identical "routine DB batch query job" from 6h to 1.5h ; a similar
migration to Alpha DS20e gear (2x 833MHz, 2GB RAM, Tru64 5.1) yielded a
runtime of ~45 mins. In all cases, it was suspected that the large
speed differential was due in part to characteristics of the HW raid
controllers involved [?flies in the face of the X86 I/O bottleneck
issue, maybe?]. Either way, this example reinforces the idea: know
your requirements / plan accordingly.

-> As for my topics WRT Linux: the above case was the only one provided
with real experience of a direct migration from SunSparc to Linux ;
many other respondents suggested they didn't trust the bandwidth of X86
gear ; prefer SunSparc given the choice ; believe SunSparc is more
suitable for Enterprise deployments. Some respondents did acknowledge
that Athlon/Solaris X86 could be a nice compromise: an inexpensive,
powerful CPU with the stability and benefits of the Solaris environment
(robust, stable, etc). Others also suggested that if X86 was desirable,
a *BSD-type environment might be more reliable / performant on the same
hardware.

-> The clear consensus among X86 gear users was, "don't skimp, get nice
gear - it is cheap (comparatively) anyhow" and, "if really paranoid,
buy 2 of everything - it still is `cheap'".

-> My own experience with Solaris/X86 or Linux/X86 on Athlon gear
clearly suggests to me that for environments where CPU-bound (not I/O
or memory bound) jobs are a big requirement, Athlons are very hard to
beat WRT power for a single-CPU / dual-CPU platform. Clearly, it isn't
currently scalable past 2 CPUs in a single SMP box (unless you are
lucky, as am I, to have PVM / MPI aware apps that scale onto
beowulf-type clusters nearly linearly :-) [but this is a separate, if
slightly related, topic], but it is clear from the responses I
received, coupled with my own experience, that this is a viable
platform for certain work.

-> Finally, despite the general naysayers, I haven't actually got much
real info regarding how the "poor I/O bus" of X86 gear actually impacts
I/O-bottlenecked deployments, so I'll be trying to find more info on
this front. The impact of commodity X86 64-bit PCI, slick hardware raid
controllers (such as 3Ware "storage switch" HW raid), and Ultra160 SCSI
doesn't appear to be terribly well documented from the perspective of
people trying to evaluate DB performance (or other performance, whether
CPU, I/O, or .. blended .. bottleneck requirements) on (Sol/Lin/?? on
X86) and the like. ie, there is (I suspect?) a lot of bad rap lingering
with X86 from ?486? / Pentium days, from what I can tell. We shall see.
The quest continues :-)


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
"Truly Wordy" Section Begins, for the absurdly interested? :-)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-



Text From Respondents (not! (too much :-) edited by yours truly)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


Here's a short answer for you... I look forward to seeing your summary
because this is an interesting topic.

I have worked in a WinTel world for a long time and have always seen the
faster processors=better performance scenario.  What I have learned
about Solaris is that more processor power does not equal better
performance and a Solaris server is more often memory constrained than
processor constrained.  The new Sun Fire servers support far more memory
than the low end enterprise line.  A V480 supports up to 32 GB of RAM
while an E420 only supports 4 GB.  You might want to take this into
account as you determine your environmental needs.

Good luck,
...

=-=-=-=-=-=-=-=-=- 
Hi Tim,
I have very limited experience with databases (mostly lightly loaded 
servers, hence not applicable in your case). However, I have plenty of 
experience with scientific applications (serial, parallel and
distributed).

That said, here are my 2 cents.
For your part 1, we can *probably* assume that your dual US3 will be at 
least as fast as the quad US2, provided your server doesn't spend 
significant time waiting on external events (like database clients).
I'm afraid this will not be your case. The idea is that when waiting
for answers from a client, the thread will do some level of busy
waiting, which uses one processor but does absolutely no work. For this
kind of load, more CPUs are better (the classic case is a news server,
where 4 slow processors are better than 2 really fast ones).

Overhead for scheduling is largely irrelevant, there's no difference 
when going from 4 to 2.

Part 2.
Regarding Linux, there's one major detail that *may* impact
performance: the Linux kernel is not as reentrant as Solaris'. This
means that multiple processors may not, at the same time, interact with
the same subsystem (be it disk or network). So for now, having a
dual-CPU box do network pushing (an FTP server, web proxy or firewall,
to give examples) is completely useless. This has changed in 2.6/3.0,
but I wouldn't run a critical server on that version yet.

The other thing is that x86 hardware usually has a much smaller memory 
bus than SPARC boxes. I'm assuming that your database server is using
tons of memory for caching and accessing it all on a fairly continuous
basis. This is very similar to my scientific applications. Our
experience here is that x86 hardware is not always able to provide
enough data to keep the CPU busy. The P4 Xeon is good; the chipset can
keep both processors fed. The P3 is real bad unless you have a Xeon. My
understanding (no direct experience) is that the Athlon's weak point is
memory bandwidth, so I would stay away from it for SMP use.

Hope this helps.

=-=-=-=-=-=-=-=-=-

On Mon, 4 Nov 2002, Tim Chipman wrote:
> comparable for my purposes probably. In all cases I'm assuming that I/O
> is *not* the primary bottleneck, ie, that CPU usage is more of a
> bottleneck than I/O (although clearly both are related to performance in
> this sort of environment).

Don't believe it.  You'd be surprised where bottlenecks can crop up.  If
you already have a machine running, get the SE Toolkit and Orca, and
graph the performance for a week or two so you can see the trends.  At
least use sar for a while.

http://www.sun.com/sun-on-net/performance/se3/
http://www.orcaware.com/orca/
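[TDC aside: once you have sar data, a back-of-the-envelope check like
the following helps decide whether a box is CPU-bound or I/O-wait-bound.
This is just an illustrative Python sketch; the sample figures and the
80/20 thresholds are hypothetical, not from any poster.]

```python
# Hypothetical "sar -u" style output; real data would come from running
# `sar -u interval count` on the server in question.
SAR_SAMPLE = """\
00:00:01  %usr  %sys  %wio  %idle
00:05:01    92     6     1      1
00:10:01    95     4     1      0
00:15:01    90     7     2      1
"""

def average_columns(sar_text):
    """Average each numeric column across the sar intervals."""
    lines = sar_text.strip().splitlines()
    headers = lines[0].split()[1:]  # skip the timestamp column
    rows = [[float(f) for f in line.split()[1:]] for line in lines[1:]]
    return {h: sum(col) / len(col) for h, col in zip(headers, zip(*rows))}

avg = average_columns(SAR_SAMPLE)

# High %usr+%sys with low %wio points at the CPUs; high %wio points at
# the disk subsystem instead. The 80/20 cutoffs are arbitrary.
if avg["%usr"] + avg["%sys"] > 80 and avg["%wio"] < 20:
    verdict = "cpu-bound"
else:
    verdict = "io-or-mixed"
print(verdict)  # -> cpu-bound for this sample
```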

> -> assume you scale a DB server from a quad-CPU-UltraSparc-II/450mhz
> (e450 calibre box) to a dual-CPU-UltraSparc-III/900mhz (v480 or v880
> calibre box),
> -do we expect performance to be faster on the 900mhz CPU system, since a
> given (unit of work for single unthreaded process) can be ~2x greater
> for a 900 vs 450mhz CPU?

It depends on the *type* of work.  If you're dealing with non-threaded
image processing, the dual 900 will be faster.  If you're dealing with
multi-threaded image processing, they may be around the same.  If
you're dealing with databases, it's going to depend on how the database
uses its disks.  If most of the delay is spent in I/O wait, faster CPUs
aren't going to help you.

Also, by going to fewer CPUs, even if they're faster, you may end up
with more mutex contention, I/O or otherwise.

> -do we expect greater capacity for multiple parallel independent
> processes on the quad-cpu machine, since there are physically more CPUs
> available to do work at any given instant? (or, does task scheduling
> more-or-less render this transparent / irrelevant? Is there "more
> overhead" to process/thread management on SMP systems with more CPUs, or
> is this more-or-less constant overhead? Linear overhead scaling with CPU
> count?)

I believe the SMP scaling is virtually linear on the Suns, so I
wouldn't worry about that.  CPU speed is only valuable if the subsystem
speeds (network, I/O, bus, etc) can keep up.  If you've got really fast
disk access, the dual-900 may be faster because you can finish the task
faster.  However, if the disk access is about average, you may have
better luck with quad-450s, especially if you have a lot of little
requests as opposed to few big ones.

> Part:2: (more blasphemous in nature :-)
> --> do we expect the answers from (Part:1:) above regarding
> quad-slow-CPU to dual-faster-CPU scaling of a server platform to have
> any relevance?

Yes, because CPUs don't work in a vacuum.  Your assumptions are biased
and not necessarily correct, and therefore the question can't really be
answered.

Suns have an amazing capacity for throughput, both network and I/O
based.
I firmly believe that a quad-900 Sun will run circles around a
dual-athlon-1600.  Depending on the application, as discussed above, an
8-way 450 may be even better.

There are other things to consider, also.  If you get a dual-900 and it
*can't* keep up with demand, you can easily add two more CPUs if you
get the correct chassis (assuming you're not using the x000 series
machines, but rather the v880/v480 type machines).  The quad v480 can't
be upgraded.  If you go with a 4800, you have many more options.

> --> has anyone actually migrated a production database from e450 calibre
> (quad-450 UltraSparc II) gear over to (linux X86 Dual CPU gear as
> described) and found the performance to scale (poorly / decently /
> nicely?)

Honestly, I wouldn't bother.  *If* I were going to a production
environment on x86 hardware, I'd look hard at the *BSD offerings.  I
love linux and I run it on my desktops at work and at home, but I
wouldn't use it for production unless I had a lot of little boxes (like
front-end web servers).

> data bus, etc) (3) Solaris vs Linux OS platform issues. However, in
> part, I'm hoping to elicit comments in these veins from people who have
> actual experience with such a migration / with similar server
> functionality on the different platforms discussed here.

We have seen some similar issues when we upgraded from an 8-way
UltraSPARC-170 machine (an Enterprise 4000) to a 4-way E4500 with
400MHz UltraSPARC II's for our Oracle work.  The UltraSPARC II was such
an advance over the 170, though, that it's not really apples-to-apples.
We quickly upgraded to six CPUs, because there were so many little
queries that we needed more CPUs to parallelize the ops more -- we were
mutex bound waiting for ops to finish.  With 6 CPUs, things are better.
Given the choice of lots of decent CPUs vs. a few blazing CPUs, for
Oracle I will generally take lots of decent CPUs.

> Also, clearly, I realize that X86 gear is NOT as scalable for a pure-MP
> system as UltraSparc-based equipment; however, this isn't my concern,
> especially since (last time I was aware?) the bulk of functionality
> served by Sun/UltraSparc based gear is NOT in 16-CPU (and up) MP boxes
> but rather in (gasp! single), dual, or quad CPU -based systems.

Yeah, but when it comes to *databases*, you want the big guys.  We've
got that E4500 with six CPUs, two D130 three-disk units, two D1000
twelve-disk units, and 8GB of memory, and we still need to be faster.
If I'm going to upgrade that box, I'm going to *at least* six 900MHz
CPUs in an E4800.

Hope this helps,


=-=-=-=-=-=-=-=-=- [further discussion ensues with above poster, dialog
semi-follows]

On Tue, 5 Nov 2002, Tim Chipman wrote:
> -partially in my posting, I indicated the CPU bench info purely from my
> experience, with 3 particular "bioinformatic" apps that are (1) commonly
> used, (2) available for both solaris-sparc, solaris-intel, linux-intel
> platforms.

Heh, just remember, an athlon machine isn't "linux-intel".  :)

>       -my testing with these 3 apps, in the type of usage that is appropriate
> for me & my environment, showed clearly:
>       -X86 hardware did run rings around the Sparc hardware, ie, a 1600mhz
> athlon *was* slightly better than 4 x 400mhz Sparc-II CPUs
>       -dual-athlon-1600mhz *did* outperform by more than two-fold our e3500
> with 4 x 400mhz cpus
>       -this pattern was consistent for all 3 of the "bioinformatic" apps I
> was testing .. which are indeed entirely CPU-bound, for the most part.

I guess it doesn't really surprise me that for entirely CPU-bound ops,
the Athlon wins.  That Athlon is an incredible processor.  I'd be
curious how it does against the UltraSPARC III, though -- since the
Athlon and the US3 are basically the same generation, while the US2 is
a generation behind.

...

> -that info was included purely to make the point, that X86 CPUs are not
> inherently "junk" (and thus inappropriate for "server deployment") which
> some folk in the solaris/sparc universe seem to feel (?)

I think we have to make the distinction between the hardware and the
software.  Two points:
1) There's nothing inherently wrong with x86 hardware and CPUs, as long
as you spend the money to get the good, high-end PC hardware rather
than the cheapest junk you can find in Computer Shopper.
2) The Solaris kernel is, I think, the single best Unix kernel currently
available.  I wouldn't trust my Oracle box to a linux kernel, despite my
liking linux very much.  For a production Oracle box, I would run *BSD
before linux, but my preference is for Solaris, which is what we have.

Ok, one more point: Solaris hardware support is *top notch*, but you pay
an arm and a leg for it.  :)

> -I won't go deep into benchmarks for when I get the dual-athlon system
> working in a beowulf cluster running MPI- or PVM- cluster-aware
> software. Basically, it scales linearly, assuming "moderately difficult
> jobs" that are "appropriate for cluster work". With a dual-athlon
> head-node and a dozen slave nodes of celeron CPUs at 800-1300mhz per
> node, it is a very nice scaling profile for this kind of CPU intensive
> work. Hence my belief that for this kind of CPU intensive work, X86
> hardware is of immense value. My guess is that I need a ~24-CPU sparc
> box to outperform my beowulf cluster. Last time I checked, such a
> 24-way-sparc-box price tag is far in excess of 10x the cost of my
> $10,000 (canadian $$) beowulf setup.

Guaranteed -- a 24 CPU Sun machine is probably going to run close to a
million bucks, once you get done with the CPU boards and the memory for
it.

Of course, clusters are an entirely different animal -- I hadn't noticed
mention of clustering in your original post.  I've not played with
Beowulf, but I understand that it's getting quite stable these days.

...

> -> ultimately, in my posting, I was trying to get a feel for the sparc
> scaling issues for a DB-server on which I know we do observe CPU as the
> bottleneck more than IO currently. (The DB server is an e450 with
> t3-disk array Raid5 storage. When we upgraded storage from A1000 storage
> to T3 storage, we did get some performance increase, but very! minimal ;
> typical observation of this DB server often shows single Oracle
> processes running @ 100% of one CPU for > 2 minutes ... and total
> available CPU at 0% ... for extended periods. IE, we're looking at
> comparatively few simultaneous, complex, CPU-intensive queries, as far
> as I can tell. IO wait states during these times are always
> insignificant (10-20% max).)

That's good info right there.  If you're seeing 100% CPU for two
minutes with low IO wait, that's definitely CPU bound, and you'll
benefit from US3 900MHz processors.  If you can, I'd upgrade to the
*same number* of processors, rather than halving them, but money's
always an issue, isn't it?

[TDC aside / interjection - also suggests Athlon2000+ would work quite
well IMHO?]

You'll need to get a number of CPUs that jibes with how many oracle
processes pin at 100% at a time.  If you only ever see *one* pinning a
processor, then two 900MHz might work just fine.  I'd play it safe with
four, given the choice.

> -> Hence, my suspicion that for this kind of Oracle usage profile, a
> dual-athlon-linuxbox might indeed be a huge performance increase over a
> quad-400 e450 sparc box.

My gut feeling is that your suspicion is correct.  I'd be curious what
the US3 processors can do for you, though.  Of course, you might want
to consider a top-of-the-line Athlon 2.4GHz instead -- those should be
faster even than the best SPARC processors Sun's got right now, as long
as the rest of the hardware can follow suit.  The key is going to be
how the Linux internals handle the load, too, but it seems like that's
working quite well based on your measurements.

....

<end more-or-less-of-dialogue, in this context :-) >

=-=-=-=-=-=-=-=-=-

SUN actually do a very good job of hiding the fact that 
UltraSPARC really isn't that quick. Take a look at 
www.specbench.org and look at any of CPU2000, 
SPECweb and SPECjbb and you will see what I mean. 
The UltraII has been a dog for years. Even the UIII is 
nothing special. For example, I was asked to run a little 
computation test for some of our app developers. It took 
a minute to run on a V880-750, and a DS20e-883 Alpha 
does the same thing in about 25 seconds. The 883 is 
now only a midrange Alpha. Granted, a lot of the Intel 
stuff is crap at shifting data around internally, but some 
of it (I'm generally talking Proliant cos they are what I 
know) is more than competent and can easily 
outperform anything like an E420/450 for a fraction of 
the cost. It's only now that the SUNfire (eg v480) stuff is 
getting affordable that things are levelling out.

Obviously where databases are concerned SUN appears 
to have an edge, but I suspect that's largely because 
the budgets for the disks and controllers tend to 
provide for better I/O than you normally find on an Intel 
box. Not many people play RAID 0+1 with striped 
volumes on Intel. The fact that I did in my tests shows 
what the stuff is capable of if given a chance.

What the commercials do still do better is high 
availability, clustering etc. I think they are also much 
better in terms of overall manageability than linux.

For big systems I'll stick with SUN/HPAQ etc all the time.
For a cheap and cheerful yet perfectly adequate system 
pick up a second hand DL380 for a couple of grand, sort 
the disks out and you have a hell of a DB server.

=-=-=-=-=-=-=-=-=-
Hi,

Took an Oracle 8.1.7 Db on an E450 (2x400, 2GB RAM, 
Solaris 8) and ported it to a Compaq Proliant (2x PIII-
550, 1.5Gb RAM, SuSE 7.1) and cut the job runtime 
from 6hrs to 1.5 hrs.

Same thing on an Alpha DS20e (2x 833MHz, 2GB RAM, 
Tru64 5.1) took about 45 mins.

Differences? Proliant had an internal RAID controller 
which allowed a better disk setup. The Alpha is just 
waaaaay quicker than an Ultra-II even using RAID5 disk 
volumes. You should have seen what it did to our java 
code compared to the SUN.

Still, these days I'm using fully packed V880s and they 
seem to get the job done. ;-)

Cheers,
...

Just FYI,

My Intel DB server box was a Proliant 3000, dual Xeon 
550MHz, 1.5 Gb RAM, Compaq SMART Array 4200 RAID 
controller, 3x18Gb internal  plus 6x9Gb Universal drives 
in an external cabinet, SuSE 7.1 with LVM. Used one 
18Gb internal to store Oracle binaries. The 6 disks in 
the external array were set up as 3 mirror pairs then LVs 
were striped across these.

Cost of the hardware was £1500 2nd hand. £75 for the 
SuSE kit plus some for a RAM upgrade. Total about £2k.

Cheers

=-=-=-=-=-=-=-=-=-

Tim,

     Interesting questions!  I'm really looking forward to your summary!

     Here's a couple of thoughts I had as I read your e-mail.  First, I
would expect performance to be better on a 2x900 MHz CPU box as opposed
to a 4x450 MHz box for CPU-intensive applications.  The reason for this
is that the 900 MHz CPUs can do more work in each time quantum.
Therefore, the process is not going to get context-switched off the CPU
as often (which also keeps the cache warmer), and will therefore finish
more quickly.  This would have the added benefit of the system needing
less CPU time to actually do the context switching.

     As far as the CPU scaling is concerned, Sun claims near linear
scalability.  At the recent SUPerG conference, one Sun manager gave the
following numbers: 23.9x scaling on a 24 CPU SF6800, 69x scaling on a
72 CPU SF15K, and 99x scaling on a 106 CPU SF15K.  Unfortunately, I
have none of those machines available to me to do my own testing!
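
[TDC aside: those figures are easy to sanity-check. Claimed speedup
divided by CPU count gives per-CPU efficiency; a quick Python sketch,
using only the numbers quoted above:]

```python
# (CPUs, claimed speedup) pairs as quoted from the SUPerG talk.
claims = [(24, 23.9), (72, 69.0), (106, 99.0)]

efficiencies = {cpus: speedup / cpus for cpus, speedup in claims}
for cpus, eff in sorted(efficiencies.items()):
    # Even at 106 CPUs, per-CPU efficiency stays above 93%.
    print(f"{cpus:3d} CPUs: {eff:.1%} per-CPU efficiency")
```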

     If CPU horsepower is your limitation, then I would think you would
definitely want to consider either Linux or Solaris x86 on Athlon-based
systems.  Let's face it, when you look at the fastest supercomputers in
the world at www.top500.org, you don't see Sun near the top of the
list.  However, as you already know, you must consider all the
variables when deciding on the platform to use.  HTHAL...


=-=-=-=-=-=-=-=-=-


On Mon, Nov 04, 2002 at 04:56:42PM -0500, Tim Chipman wrote:
> Hi Folks,
> 
> Part:1:
> 
> -> assume you scale a DB server from a quad-CPU-UltraSparc-II/450mhz
> (e450 calibre box) to a dual-CPU-UltraSparc-III/900mhz (v480 or v880
> calibre box), and given the (apparently more-or-less true?) assumption
> that performance for US-III CPUs is a fairly linear scale-up when
> compared to US-II CPUs,

Queuing theory shows that 2 processors of power N are always slower
than 1 processor of power 2N (N being the processing power of a single
CPU, and 2N twice that).

> -do we expect performance to be faster on the 900mhz CPU system, since a
> given (unit of work for single unthreaded process) can be ~2x greater
> for a 900 vs 450mhz CPU?

The problem gets worse with scheduling etc. The "best" typical scale-up
for a 2-processor system is something like N + 0.99*N. For a
4-processor system you'll be looking at:
 N + 0.99*N + 0.99^2*N + 0.99^3*N, because of locking etc.
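
[TDC aside: that rule of thumb is easy to model in a few lines. The
0.99 factor is the poster's illustrative figure, and the "effective
MHz" comparison is my own back-of-the-envelope extrapolation, not a
benchmark:]

```python
def effective_cpus(n, factor=0.99):
    """Effective processing power of n CPUs if each additional CPU
    contributes `factor` times the useful work of the previous one
    (locking/scheduling overhead compounding geometrically)."""
    return sum(factor ** i for i in range(n))

# Applied to the original 4x450 vs 2x900 question, in crude "effective
# MHz": the dual-900 comes out *very* slightly ahead for pure CPU work.
quad_450 = 450 * effective_cpus(4)  # about 1773
dual_900 = 900 * effective_cpus(2)  # exactly 1791.0
print(round(quad_450, 1), round(dual_900, 1))
```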

> -do we expect greater capacity for multiple parallel independent
> processes on the quad-cpu machine, since there are physically more CPUs
> available to do work at any given instant? (or, does task scheduling
> more-or-less render this transparent / irrelevant? Is there "more
> overhead" to process/thread management on SMP systems with more CPUs, or
> is this more-or-less constant overhead? Linear overhead scaling with CPU
> count?)

It also depends on cache behaviour: ie. whether processes are
processor-bound with few cache misses, and on the cache sizes, ie.
4x4MB vs. 2x4MB, or 4x4MB vs. 2x8MB, etc.

> Part:2: (more blasphemous in nature :-)
> 
> -> Assume:
> 
> --you have no issues with (stability, hardware selection, maintenance,
> support) of linux on reliable, "high performance" X86 hardware (ie,
> dual-athlon-MP @ 1800 mhz/"2200+"CPU, Ultra-160 SCSI-64-bit PCI disk
> subsystem//10k RPM SCSI drives, quality DDR ECC 266 ram, etc, no
> bleeding edge features in OS/solid kernel)
> --hence no trouble with > 300 day uptimes on such a platform, and then,
> downtime is only for scheduled maintenance, 
> --*given* a knowledge that approximately, for *purely* CPU-limited
> performance, athlon "true MHZ" is approximately equal to UltraSparc II
> CPU performance (when compared to CPUs of either e450 or e3500 server)
> [this has been true for tests I have performed, BTW, as I said such a
> case where IO is not an issue and CPU is the bottleneck - thus, a
> dual-athlon-1600mhz performs on par for CPU-limited task ~identically to
> an 8-CPU e3500 with 400mhz CPUs, or ~2x faster than a 4-cpu@400mhz e450]

I've always held the view that CPU-bound stuff (ie. SETI@home, RC5DES,
fractals etc.) is better on clusters/arrays of Intel-type HW (like my
preferred AMD Athlons ;^), while I/O-bound stuff (like DBs) is better
done on Sparc-type platforms with *decent* internal bandwidth (given
the disks are correctly/decently set up ;^)

BTW: the prices on Intel-based platforms are cheaper for Oracle etc. :(

=-=-=-=-=-=-=-=-=-


=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
ORIGINAL POSTING FOLLOWS
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Subject:               Scaling Query: (4 x 450) vs (2 x 900)
   Date:               Mon, 04 Nov 2002 16:56:42 -0500

Hi Folks,

A general question in two parts, on the same theme. I'm hoping to elicit
comments from people who have actually done scaling "in this kind of
way", in particular (ideally) on DataBase (oracle, etc) server
environments, but really, anything with a mix of CPU and IO would be
comparable for my purposes probably. In all cases I'm assuming that I/O
is *not* the primary bottleneck, ie, that CPU usage is more of a
bottleneck than I/O (although clearly both are related to performance in
this sort of environment).

Part:1:

-> assume you scale a DB server from a quad-CPU-UltraSparc-II/450mhz
(e450 calibre box) to a dual-CPU-UltraSparc-III/900mhz (v480 or v880
calibre box), and given the (apparently more-or-less true?) assumption
that performance for US-III CPUs is a fairly linear scale-up when
compared to US-II CPUs,

-do we expect performance to be faster on the 900mhz CPU system, since a
given (unit of work for single unthreaded process) can be ~2x greater
for a 900 vs 450mhz CPU?

-do we expect greater capacity for multiple parallel independent
processes on the quad-cpu machine, since there are physically more CPUs
available to do work at any given instant? (or, does task scheduling
more-or-less render this transparent / irrelevant? Is there "more
overhead" to process/thread management on SMP systems with more CPUs, or
is this more-or-less constant overhead? Linear overhead scaling with CPU
count?)



Part:2: (more blasphemous in nature :-)

-> Assume:

--you have no issues with (stability, hardware selection, maintenance,
support) of linux on reliable, "high performance" X86 hardware (ie,
dual-athlon-MP @ 1800 mhz/"2200+"CPU, Ultra-160 SCSI-64-bit PCI disk
subsystem//10k RPM SCSI drives, quality DDR ECC 266 ram, etc, no
bleeding edge features in OS/solid kernel)
--hence no trouble with > 300 day uptimes on such a platform, and then,
downtime is only for scheduled maintenance, 
--*given* a knowledge that approximately, for *purely* CPU-limited
performance, athlon "true MHZ" is approximately equal to UltraSparc II
CPU performance (when compared to CPUs of either e450 or e3500 server)
[this has been true for tests I have performed, BTW, as I said such a
case where IO is not an issue and CPU is the bottleneck - thus, a
dual-athlon-1600mhz performs on par for CPU-limited task ~identically to
an 8-CPU e3500 with 400mhz CPUs, or ~2x faster than a 4-cpu@400mhz e450]

Then,

--> do we expect the answers from (Part:1:) above regarding
quad-slow-CPU to dual-faster-CPU scaling of a server platform to have
any relevance?
--> has anyone actually migrated a production database from e450 calibre
(quad-450 UltraSparc II) gear over to (linux X86 Dual CPU gear as
described) and found the performance to scale (poorly / decently /
nicely?)


Clearly, I realize that there are *tons* of variables when attempting to
compare (1) Sparc SMP based on UltraSparc II vs UltraSparc III CPUs, (2)
UltraSparc of any kind vs. X86 CPU/Hardware platform (ie, CPU cache,
data bus, etc) (3) Solaris vs Linux OS platform issues. However, in
part, I'm hoping to elicit comments in these veins from people who have
actual experience with such a migration / with similar server
functionality on the different platforms discussed here.


Also, clearly, I realize that X86 gear is NOT as scalable for a pure-MP
system as UltraSparc-based equipment; however, this isn't my concern,
especially since (last time I was aware?) the bulk of functionality
served by Sun/UltraSparc based gear is NOT in 16-CPU (and up) MP boxes
but rather in (gasp! single), dual, or quad CPU -based systems.

Any comments or feedback regarding these questions are certainly very
much appreciated. As always, I'll summarize to the list.


Thanks!


Tim Chipman
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Mon Nov 25 17:46:52 2002

This archive was generated by hypermail 2.1.8 : Thu Mar 03 2016 - 06:42:58 EST