SUMMARY: SVM vs hardware mirroring (raidctl or arcconf)

I received rather a wide variety of responses -- thanks to all who took the
time to share their views.  A few were from people using other hardware and
thus not directly applicable.

o Only a couple of respondents mentioned any sort of monitoring. One seems to
have hosts only in his/her office area and relies on visual indicator lights
on the systems, on consoles up on desktops, or manual log scans.  My group's
situation is perhaps different from many:  we have concentrations of hosts at
two large domestic DC's, ones that are staffed by other groups.  There are
processes for other groups performing hands requests, but response time is
variable.  We also have a number of systems scattered around other domestic
locations and two other continents (with a third poised to enter the mix), and
hands there often have even more variable skill/response time, especially
in locations where we don't have staff and rely on third parties.  Monitoring
and resilience are thus critical for us.  Loss of a system's bootability /
functionality in another country could take weeks or months to resolve.

o Several respondents are pleased, or at least satisfied, with the LSI /
Adaptec RAID HBA's.  One mentioned that this approach is followed due to a
recommendation by Sun.  I've had a similar recommendation from Sun myself in
the past, on one of the rare occasions when I got connected directly to a
high-tier support tech or actual engineer.  The latter told me, though, that
volumes [using x4?00 onboard HBA, at least] aren't necessarily transportable
between hosts or HBA firmware versions, which concerned me a great deal.  If a
chassis dies on me, I want the ability to swap the disks into another - a
strategy that I've had to take at least once in the last year due to the
maddening process of trying to straighten out Sun/Oracle support contracts and
access to them via Sunsolve/MSC.

 Another issue I just discovered with the Adaptec HBA RAID is that it doesn't
handle 3-way mirroring.  The docs seem to claim that mirroring with an odd
number of mirrors isn't feasible, which is puzzling.  One respondent suggests
the use of what Adaptec calls RAID 1E, which would seem to have lower
failure tolerance than straight mirroring.  This is a big drawback for me.
Another is the software required to manage/monitor Adaptec RAID HBA volumes,
known as Sun StorageTek RAID Manager and various other names.  For reasons
that aren't clear to me, Solaris doesn't ship with it.  Sun's docs for the
card say to install the stuff from media shipped with it, which I'm wary of
given that software on bundled media tends to be distinctly out of date (like
the ILOM/BIOS discs I've seen) and there's no clear update mechanism.  I asked
Sun about the stuff and they pointed me at a page on LSI's site for no
apparent reason, and to an out-of-date version on *Intel's* site.  I
subsequently found a more recent version on Adaptec's site, but I'm hesitant
to rely on such, especially as one respondent directed me to an announcement
that Adaptec is basically being sold for parts, so the future of their RAID
HBA products in the hands of the buyer of that technology is rather uncertain
-- but of course the future of Sun's x4??0 hosts under Oracle is itself a very
big unknown.
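
To make the RAID 1E concern above concrete, here is a minimal sketch that
counts how many simultaneous disk failures each layout is guaranteed to
survive.  The RAID 1E layout used here is a simplified striped-mirror model
(two copies of each block, placed round-robin across the disks); it is an
assumption for illustration, not Adaptec's documented on-disk format, and all
function names are hypothetical.

```python
from itertools import combinations


def raid1e_layout(n_disks, n_blocks):
    """Map each block to the two disks holding its copies (simplified model).

    Copy slots are filled round-robin, so block b lands on slots 2b and 2b+1.
    """
    return {b: {(2 * b) % n_disks, (2 * b + 1) % n_disks}
            for b in range(n_blocks)}


def mirror_layout(n_disks, n_blocks):
    """Straight N-way mirror: every block has a copy on every disk."""
    return {b: set(range(n_disks)) for b in range(n_blocks)}


def survives(layout, failed):
    """True if every block still has at least one copy on a working disk."""
    return all(copies - set(failed) for copies in layout.values())


def max_guaranteed_failures(layout, n_disks):
    """Largest k such that ANY k failed disks leave all data readable."""
    tolerated = 0
    for k in range(1, n_disks + 1):
        if all(survives(layout, f) for f in combinations(range(n_disks), k)):
            tolerated = k
        else:
            break
    return tolerated


if __name__ == "__main__":
    print("RAID 1E, 3 disks:", max_guaranteed_failures(raid1e_layout(3, 3), 3))
    print("3-way mirror:    ", max_guaranteed_failures(mirror_layout(3, 3), 3))
```

Under this model the 3-disk RAID 1E set is only guaranteed to survive one
failure, while the 3-way mirror survives any two.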

o To my surprise, one respondent prefers *not* to mirror the boot filesystems,
opting instead for a snapshot/clone strategy on a second internal disk as
protection against mistakes on the live disk.  This respondent's organization
makes heavy use of load-balanced or clustered systems, apparently with the
idea that an individual system failure is tolerable and can be fixed during a
planned outage.

o Finally, several respondents recommend the use of ZFS.  We've been using ZFS
for a while now for a number of data volumes, and while stability has
increased with recent OS and patch levels, I'm still a bit wary of it given
the pathological behavior I've seen when a disk fails.  I'd been hesitant to
try ZFS root, especially with Live Upgrade in the mix, but perhaps with Sol10u8
it's stable enough to give it a whirl.  This appeals in that it's managed
identically across hardware models, with bundled utilities, and 3-way mirrors
are straightforward to set up (though JumpStart doesn't do them directly -
we'll have to add the third mirror manually).  I'm going to set one of these
up to see how it goes.
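
The manual step of adding the third mirror might look roughly like the
dry-run sketch below, which only prints the commands rather than running
them.  The pool name and device names are hypothetical placeholders, and it
assumes an x86 host (hence installgrub; SPARC would use installboot instead)
and a third disk that already carries an SMI label with a suitably sized
slice.

```python
def third_mirror_commands(pool, existing_dev, new_dev):
    """Return argv lists for attaching new_dev to an x86 ZFS root pool.

    pool, existing_dev and new_dev are placeholders supplied by the caller;
    nothing is executed here.
    """
    return [
        # Attach the new slice as an additional side of the existing mirror.
        ["zpool", "attach", pool, existing_dev, new_dev],
        # Make the new disk bootable (x86 only).
        ["installgrub", "/boot/grub/stage1", "/boot/grub/stage2",
         "/dev/rdsk/" + new_dev],
        # Check resilver progress before relying on the new mirror.
        ["zpool", "status", pool],
    ]


if __name__ == "__main__":
    # Hypothetical device names; substitute the real ones from format(1M).
    for argv in third_mirror_commands("rpool", "c1t0d0s0", "c1t2d0s0"):
        print(" ".join(argv))
```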
