Sorry for the absurdly late summary. "Better Late Than Never[?]" I hope. MANY thanks to those who replied [in no particular order]: Sebastien Daubigne, Alex Avriette, Buddy Lumpkin, Neil Guiogue, JV.

The (relatively) quick "executive-type summary" follows, then a few verbatim quotes which are especially relevant. See [way below] for the original posting.

-> Alas, nobody has done "exactly what I am asking about" (a Sol8->9 upgrade of an E450 Oracle server, evaluating/noting performance changes from the upgrade and also from UFS parameter tuning, in particular "forcedirectio").

-> It appears that Solaris 8 Update 3 and later have all contained this feature, "Concurrent DirectIO", which (apparently) is of greatest potential benefit to environments experiencing "Parallel Write-Intensive Database Access". I.e., it isn't just Solaris 9 that has this, but every Solaris 8 release since MU3. One response indicates that these features are actually present in Solaris 8 since the 01/01 release (as suggested at the URL http://solaris.java.sun.com/articles/performance.html ).

-> One response indicated that after enabling DirectIO following an upgrade to Solaris 9, performance "appeared to increase", but this wasn't a hard benchmark. [see below for the exact text/comment]

-> One response suggests that systems experiencing "Tons of IO" [service times of 500-1000 ms per request] are those which benefit most from this sort of tuning, and would especially benefit if migrated towards storing data on a character device [raw device, Quick I/O, Direct I/O, etc.] rather than on a block device. [see response two's full text below, a bit longer / more detail]

-> One response suggests the upgrade is almost certainly not a bad thing and will potentially give some small boost "in general", but that hoping for magical gains via directIO mount options is improbable. If magic is required, maybe an investment in VxFS is in order, which is suggested to be "ALWAYS faster than UFS".

-> Finally, don't forget to use the "noatime" and "logging" mount options [I was], since these should be of some benefit. Also, examine how the Oracle data is stored/distributed across controllers and disks and optimize it if possible. And of course, app-level optimizations (query tuning, etc.) are often the best way to enhance performance, since legacy, obsolete workarounds are a surefire way to degrade it (the "why are we doing things this way anyhow?" syndrome :-).

All this, finally, to say: I haven't yet upgraded the box to Solaris 9, but I plan to do so sometime in the not-too-distant future. We previously tested the directio mount option (and given this box runs the Solaris 8 10/01 release, which is later than 01/01, it suggests I've already got all the directio magic on board) and found it caused a drop in performance, an increase in CPU load and an increase in I/O waits. Clearly, in light of this, I can't expect Sol9 DirectIO to make things go faster, but I'll probably test it again just to be certain.
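For reference, this is roughly what the mount options in question look like. The device names and mount point below are made-up examples, not our actual layout:

   # /etc/vfstab entry for an Oracle data slice with logging + noatime
   # (forcedirectio only if testing shows it helps); device names are examples only
   #device            device to fsck      mount  FS   fsck  mount    mount
   #to mount                              point  type pass  at boot  options
   /dev/dsk/c1t0d0s6  /dev/rdsk/c1t0d0s6  /u01   ufs  2     yes      logging,noatime,forcedirectio

   # Toggle it on a live filesystem for a quick test, and back off again:
   mount -o remount,forcedirectio /u01
   mount -o remount,noforcedirectio /u01

   # "mount -v" (or /etc/mnttab) shows whether forcedirectio is actually in effect:
   mount -v | grep /u01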
Many thanks again to everyone for their replies. I hope this info gets archived and is of use to someone else, somewhere/somewhen.

--Tim Chipman

========================================================================
===Response One - re: Sol8->9 upgrade, subsequent DirectIO enable===

I can tell you that the Sun Blueprints book, _Tuning Databases for the Solaris Platform_, references directio specifically. We run our Oracle 9 database here on directio disks, and I have been very pleased with the performance. ...[mini-snip].......

However, my personal tests (I have "wellness" tests for the database which I run nightly) seem to indicate that the database is faster than it was without directio. Note that you can actually mount filesystems with directio on Solaris 8 as well. I cannot confirm that the directio option in 9 is faster than in 8; we weren't using it in 8. When I made the switch from 8 to 9, I got our new RAID set up and made sure Oracle was happy with it. It was. I then switched on directio, and it was even happier -- and the DBA commented that something had "changed" and the database seemed faster. He is no doubt plotting a way to slow it down for me.

========================================================================
==Response Two - re: character-vs-block devices & theoretical issues==

On a system that is heavily I/O bound, you might see some fairly high gains, maybe 20-30%, but I would only predict an 80% gain if the system was totally on its knees because of thrashing in the VM.

Here's how this works: any "filesystem" I/O in Solaris uses the page fault mechanism, either directly via mappings obtained via mmap(), which causes an as_fault() (page fault) to occur, or indirectly through the segmap driver for reads and writes, which also causes an as_fault() to occur. Even if a file is opened with O_SYNC (which is what Oracle does, BTW) on cooked filesystems, a write call will block until the write succeeds, but all of the data will still end up in memory as each page is faulted in via the page fault mechanism.

The problem is that, since all I/O is forced through physical memory, there comes a point where the VM system is literally your bottleneck. In Solaris 2.6 this was worse, because once physical memory reached lotsfree (or cachefree with priority paging enabled), the scanner was the only method for evicting old pages to make room for new ones. In Solaris 8, they implemented a cyclical page cache so that once segmap is full of 256 MB worth of mappings, it takes the pages referenced by the oldest mappings and puts them on the cache list. This solved a memory leak that was present in Solaris 2.6, but doesn't solve the problem that the VM system wasn't designed to handle tons of disk I/O to the filesystem.

Using a character device (raw device, Quick I/O, Direct I/O, etc.) alleviates this because any I/O to a character device vectors directly into the device driver for that device. The device does not have to adhere to the vfs/vnode abstraction interface that was put in place for all block I/O (each filesystem has an ops structure that implements read, write, ioctl, etc. for that filesystem). For a character device, reads and writes are funneled directly to the disk they represent and completely bypass the page fault mechanism.
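To see the distinction on Solaris, compare the two views of the same slice (c1t0d0s6 below is just an example name):

   # Block vs character ("raw") view of the same example slice:
   ls -lL /dev/dsk/c1t0d0s6    # mode begins with 'b': block device, I/O goes through the page cache
   ls -lL /dev/rdsk/c1t0d0s6   # mode begins with 'c': character device, I/O goes straight to the driver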
So that's theoretical, and you asked about real-world experience, right? I can tell you first-hand that we have experienced significant performance gains by going to Quick I/O and Solaris 8. It's hard to quantify which part of the gain is the kernel asynchronous I/O and which is bypassing the VM system, but I can tell you there's a solid 30% gain to be had on a system that's doing tons of I/O.

What's tons of I/O? Watch iostat output, in particular the average service time. It's not the most accurate metric, because it reflects the total amount of time for the I/O to be serviced, including the time the I/O was queued up in the device driver, but if you're getting service times that start to approach 0.5 to 1 second (500-1000 ms), you are heavily I/O bound and will notice an improvement from switching to a character device and optimizing the storage end as well. Keep in mind that if the I/O sizes are extremely large, as during large table scans, it is normal to have higher-than-average service times, but 500-1000 ms is still too much.
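A minimal way to watch those numbers (the 10-second interval is arbitrary):

   # Extended device statistics every 10 seconds; watch the wsvc_t/asvc_t columns
   # (wait-queue and active service times, in milliseconds):
   iostat -xn 10

   # Older-style output with a single svc_t column also works:
   iostat -x 10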
We use Quick I/O on all of our large Oracle database systems. There have been times when files were added and the DBA forgot to convert them to Quick I/O. We noticed a difference when only 9 files were not Quick I/O, after converting them to Quick I/O.

========================================================================
===ORIGINAL POSTING===

Hi All,

I've done a bit of digging but cannot find anything other than "Sun promo material", and ideally I'm hoping to get some relevant "real world experience from the trenches".

We've got an Oracle8i database running on a Solaris 8 E450 (4x 400 MHz CPUs, 2 GB RAM, Solaris 8 10/02 fully patched; the "important data" {WRT performance} is on a T3 and less critical data is on an A1000; using vanilla logging UFS filesystems for the Oracle data slices).

Reading the Solaris 9 release & feature notes, I am particularly tempted by the "UFS Concurrent Direct I/O" features that claim to provide (in certain circumstances) an "87% performance increase over directIO alone" [this stat quoted from the URL http://wwws.sun.com/software/solaris/sparc/solaris9_features_scalability.html ].

However, being familiar with the reality that "certain circumstances" often refers to conditions that will never be relevant to me, I thought I would try to get feedback from real-world deployments which have migrated Oracle8 databases from Solaris 8 to Solaris 9 while staying on the same "not terribly current" SPARC hardware, and whether any performance increases (or decreases?) were indeed observed.

Ultimately, I realize the only way to be *certain* is to jump in and try it myself -- but prior to this I thought it would be prudent to hear (ideally) at least a couple of success stories to help sway the reluctant, stodgy "things work just fine as they are right now, thank you very much" side of myself.

Many thanks, and as always, a summary will follow.

_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers