Summary: Dual SCSI adapters for mirroring root disk overkill?

From: bill.walker@db.com
Date: Wed Oct 20 1999 - 03:18:41 CDT


My original question is at the foot of the mail.

Lots of good replies on this (too numerous to thank everyone individually, but
thanks to you all anyway), with the general consensus being that if you can
afford to (or cannot afford not to), then go for dual SCSI adapters for root
disk mirroring. The cost of loss of service and rebuild often outweighs the
cost of a PCI SCSI adapter (around £223).

Equally, many pointed out that SCSI adapters rarely fail and that they were
happy to mirror on the same SCSI controller, although the general consensus
was that there were potential performance benefits from the dual SCSI
approach. Some asked why we use an Ultra 5 as a server - these are sometimes
used as NIS masters, market data proxies etc. - important yet lightweight
services.

Some interesting comments on SCSI + IDE mirroring too.

Here are a few sample replies:

"From a performance standpoint, you'll do better with two scsi adapters
(this setup is usually known as duplexing, mirroring doesn't use the extra
controller), and if you have one controller fail, a spare is nice.
 However, scsi controllers don't fail anywhere near as often as drives, so
mirroring alone (where performance is not the main issue) is a good
compromise between cost and reliability."
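
For reference, here is a minimal Solstice DiskSuite (ODS) sketch of the
dual-controller layout described above. The metadevice names and the target
devices (c0t0d0 on one controller, c1t0d0 on the other) are hypothetical, so
substitute your own:

    # State database replicas, spread across both controllers
    metadb -a -f -c 2 c0t0d0s7 c1t0d0s7

    # One submirror per controller
    metainit d10 1 1 c0t0d0s0
    metainit d20 1 1 c1t0d0s0

    # Build the mirror on one submirror, then attach the second;
    # ODS resyncs d20 from d10 in the background
    metainit d0 -m d10
    metattach d0 d20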

"I have worked with hundreds of Sun boxes over the last 15 years,
and have had countless disk failures. But, I could count on 1
hand the number of scsi controller failures. In some cases even
if the controller fails, the disk and the data are still
useable, so I don't think that separate controllers are necessary.
It is still a nice option to have, and when designing a system,
I would make it part of my specifications, since the relative
price over the life of the machine, is peanuts compared to the
cost of rebuilding a system in the event of a failure."

"I have been doing IT work for years, and have yet to have a SCSI controller
fail. However, I still do the dual card option for redundancy.."

"Bill, I always do it...

There is a single caveat, what is the machine being used for in terms
of Business Function, this is likely to be the key factor in any
cost/benefit analysis, you should ask yourself a very cynical and
real-world question"

"Can I afford politically or financially to have this box go down for
a scenario that could have been avoided by spending £xxx.xx?"

Do not lose sight of the fact that MTBF are not real, they are
averages, and you would need to use a lot of SCSI controllers to
achieve this average! If you are asking yourself about MTBF's of SCSI
cards what about disks? These day's they are longer the useful life
of the disks, we all still mirror them though, because the recovery
time and cost of a failure is not acceptable"

"For an enterprise production server, mirroring the OS on seperate scsi hdd and
different scsi bus is not overkill, it is a necessity. You can't expect to be
secure if your OS is on the same scsi bus, what will happen if this scsi bus
fails?
The machine will crash, stoping the whole company. Not a pretty sight.

On the other hand, since the U5/U10 are workstations, the extra precaution of
dual scsi adapters and mirror of the OS on different hdd is useless. Useless
because you do not have N+1 power-supplies, fans and hot swap hdd. Why pay more
for a workstation? If it crashes, you can have it up and running easy on the
same
day and it will only affect one or two users, not the whole company as in a
production server crash. If you have nfs mounted fs from this workstation, then
replicate these fs on a minimum of two stations. (in fact, you should use a
production server for these tasks of nfs mount point and fs, because a server
is less likely to crash than a workstation. Sometimes a machine with nfs
mounted
fs can hang if the fs host goes down, which is not good.)

In brief, dual scsi adapters and seperate scsi bus for the mirror drives of the
OS
is an absolute necessity in a production server.
But, for a workstation (such as the U5/U10) it is a waste of money and time.

Of course, the scsi hdd will give you better performance than the ide drive, but
you have to figure out if it is really worth the price. For a normal
workstation
on which big mathematics and calculs will not be done, ide is fine. But if the
machine will be used for extensive calculs in which swap is going to happen,
scsi
will outperform ide big time. Ide drives really suck when they're extensively
used for swap.

I'm looking forward to your summary on this one ;)"

"Real world experience...

We had an Ultra 2 server with two external multipacks which were mirrored to
each other on different channels. An admin accidentally pulled the plug on
one of the disk packs. All we had to do was re-init the mirrors and
continue on as normal, with no interruption to the customer.

Also, with only one channel to your disks, how do you handle a failure of
the SCSI card (which has happened) or a failure in the storage unit
(multipack or A1000)? You can check the MTBF of devices, but the one time
you lose that device before its MTBF has been reached and you take a
corporate outage on it, they'll be asking how it could have been prevented.
It'll probably cost you about $4,000 list to add a card and an 18GB multipack
to the machine. If you can survive a system being down while you replace
the on-board SCSI in the server, then ask yourself if it's worth it."
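
As an aside, recovering from the pulled-plug scenario above is straightforward
with ODS; a hedged sketch, assuming the mirror is the d0 metadevice and the
errored submirror component lives on c1 (hypothetical names):

    # See which components went into "Maintenance" state
    metastat d0

    # Re-enable the errored component in place; ODS resyncs it
    metareplace -e d0 c1t0d0s0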

"We do exactly the same thing. Although MTBF on SCSI controllers is probably
important (we have also never seen a controller fail), but having both sides
of a mirror on the same SCSI bus has performance issues. With the two sides
of the mirror on different busses, a write of a file takes as long as it would
if the filesystems were not mirrored. No performance hit. However, if both
sides of the mirror are on the *same* SCSI bus, each filesystem write would
take twice as long because it would be written twice on that bus. For a
single write its negligible, but if the developers are planning to have any
disk IO at all, you probably want to insist on two busses for the two halves
of the mirror purely for performance reasons."
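
To put rough illustrative numbers on that point: assuming a narrow UltraSCSI
bus at 20 MB/s and a 10 MB write, the two submirror writes complete in
parallel in about 0.5 seconds on separate buses, but take about 1 second when
a single bus has to carry the data twice. Real figures will depend on drive
speed, write caching and how busy the bus already is.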

"It depends on how interested you are in avoiding the outage. The dual bus
approach will also protect you against a SCSI device failure that hangs the
bus. You may also want to consider mirroring across 2 IDE controllers in
the U-10 configuration. Obviously, you don't want to mirror an IDE drive to
a slave device on the same controller, since failure of the master will make
the slave unavailable, but using the secondary controller should give pretty
much the same redundancy as dual SCSI controllers with less hardware cost
involved."

"It may be overkill, but you may be asking the wrong question. The real question
is:
   What will it cost if it fails vs. what will it cost to prevent it?"

"We haven't seen any problems with simply keeping the IDE drive installed and
mirroring to the external drives on the SCSI card. We generally view the
disk as a more likely source of failure than the card and do not see the
need to keep to SCSI for both. As we would need the SCSI card for a tape
drive anyhow it's cost effective to get a disk to go on the chain.

You may be right about performance, however we generally don't touch the
disk at all even for swap."

"I have never tried, bu I would expect ODS to happily mirror across one IDE
and one SCSI drive...

The problem with two SCSI disks on one controller is that writing will
require two sequential writes on the same controller. With modern
controllers and disks this may not be noticeable.

Also, a disk could potentially fail in a way that affects the bus. Perhaps
not by disabling it completely, but by causing a lot of resets, etc.

On the other hand, SCSI controllers seldom die; disk and cable problems
are the common ones. And with internal disks, cables wouldn't be a problem.

If you want external devices (tape or other), then I would recommend
having 2 controllers (perhaps IDE + SCSI)."

"It really depends on your needs, doesn't it?

I have servers that, if they were to fail, I'd be perfectly OK saying 'OK,
give us a day or two to bring them up.'

I have servers that I must bring up ASAP.

I have servers that must not go down unintentionally, but if they do, we
lose _some_ potential revenue (probably not more than about $1K/hr), but not
much.

I have friends who are responsible for servers that, if they were to go down,
would cause losses of MILLIONS per hour.

And some people are responsible for servers that literally could cause
people to die if they were to go down.

Are dual SCSI controllers overkill in the first case? Probably.
Are they overkill in the last case? No.

For what it's worth, I tend to like spending money and we are mandated to
'do it right,' but I don't put dual SCSI controllers on anything but my most
critical servers (and for what it's worth, these servers are not hosted on
U5s -- they're hosted on E4500s)."

MY ORIGINAL QUESTION

For some time our mandatory standard for all production servers has been to
mirror the root disk across separate SCSI controllers (using VxVM or ODS) for
resilience purposes, with the added benefit of possibly better performance.
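
For completeness: with ODS, that standard also means making the mirror
bootable. A hedged sketch, where d0 is a root mirror built as in the earlier
example:

    # Point /etc/vfstab and /etc/system at the mirror metadevice, then flush
    metaroot d0
    lockfs -fa

Then, at the OBP ok prompt, alias the second disk so the machine can fall back
to booting from the surviving controller (the device path here is hypothetical
and machine-specific):

    ok nvalias rootmirror /pci@1f,0/pci@1,1/scsi@2/disk@0,0:a
    ok setenv boot-device disk rootmirror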

With the advent of smaller PCI-based systems (e.g. U5 or U10) with an internal
IDE drive, this now entails us installing two PCI SCSI adapters and two SCSI
disks (e.g. one external, one internal) and removing the IDE drive.

One of our deployment team is (understandably) questioning whether the
additional cost of having two SCSI adapters is necessary, suggesting that
mirroring root across two SCSI disks on a single SCSI controller should be
acceptable. He's asking for MTBF stats for PCI SCSI controllers etc.

I'm open on this - i.e. I can equally attempt to justify the need for dual
SCSI adapters or see his point - I can't remember the last time I had a SCSI
adapter failure.

What are the group's opinions on this: are our current standards for dual SCSI
adapters overkill in this day and age?

Bill Walker

Deutsche Bank, London


