[lug] RAID installation on Fedora 6 Zod

Sean Reifschneider jafo at tummy.com
Fri May 11 15:46:29 MDT 2007


On Fri, May 11, 2007 at 03:24:08PM -0600, Nate Duehr wrote:
>Hot-swap that works flawlessly is a requirement, in that environment, 

I disagree.  I'd rather have a pair of machines with cold swap hardware
running in an HA configuration than have one machine that has all sorts of
hot swap hardware.  But I'd also rather apply updates in a timely manner on
the backup system and then fail over and test...

This is exactly what we did for part of the billing system in a large
telecommunications company, where the system was responsible for a large
fraction of a million dollars a day worth of interest revenue alone.
If the system was down for a day, they lost that much money just on the
interest on the money the system was handling.

In this case, the systems had redundant power supplies, pulled from
different PDUs, but the storage was connected to two physical machines and
if one failed the other would grab the storage and take over.  Each drive
in a mirror pair was on a different SCSI bus and also was powered from a
different PDU, to prevent a misbehaving component that takes down a whole
SCSI bus from bringing the system down.
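
To illustrate the placement rule just described, here is a minimal sketch of
a check that the two drives of a mirror pair share neither a SCSI bus nor a
PDU.  The Drive record, its field names, and the bus/PDU labels are
hypothetical, made up for the example; nothing here comes from the system
described above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Drive:
    # Hypothetical inventory record; the field names are assumptions.
    name: str
    scsi_bus: str
    pdu: str


def shared_failure_points(mirror: Tuple[Drive, Drive]) -> List[str]:
    """List the components that both halves of a mirror depend on."""
    a, b = mirror
    problems = []
    if a.scsi_bus == b.scsi_bus:
        problems.append(f"{a.name} and {b.name} share SCSI bus {a.scsi_bus}")
    if a.pdu == b.pdu:
        problems.append(f"{a.name} and {b.name} share PDU {a.pdu}")
    return problems


if __name__ == "__main__":
    good = (Drive("disk0", "bus-A", "pdu-1"), Drive("disk1", "bus-B", "pdu-2"))
    bad = (Drive("disk2", "bus-A", "pdu-1"), Drive("disk3", "bus-A", "pdu-2"))
    print(shared_failure_points(good))
    print(shared_failure_points(bad))
```

An empty list means a single bus or PDU failure cannot take out both halves
of that mirror.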

Anyway, I'd still say that it's better to have independent, redundant
systems than hot swap in a single system.

This is exactly what we do for our routers at our hosting facility.  There
are two lower end routers running with fail-over configured.  For software
updates we do the secondary first, test it, and then fail over to it when
we're happy with it.  Once we are happy with its performance, we upgrade
the primary and fail back.  If the upgrade to the backup router has
problems, we can back out just by failing back to the primary.  It's been
designed so that the failovers happen with no visible interruption in
networking.
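
The upgrade order above can be sketched roughly as follows.  This is only a
toy model under stated assumptions: Router, Pair, rolling_upgrade, and the
healthy() callback are hypothetical names invented for the illustration, and
"upgrading" is reduced to changing a version string.  It is not the tooling
actually used on the routers.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Router:
    name: str
    version: str


@dataclass
class Pair:
    active: Router   # currently carrying production traffic
    standby: Router  # idle, ready to take over
    log: List[str] = field(default_factory=list)

    def fail_over(self) -> None:
        # Swap roles so the former standby carries traffic.
        self.active, self.standby = self.standby, self.active
        self.log.append(f"failover -> {self.active.name}")


def rolling_upgrade(pair: Pair, new_version: str,
                    healthy: Callable[[Router], bool]) -> bool:
    """Upgrade the standby first, test, fail over, then do the primary."""
    primary, secondary = pair.active, pair.standby

    # 1. Upgrade and test the secondary while it carries no traffic.
    secondary.version = new_version
    pair.log.append(f"upgraded {secondary.name}")
    if not healthy(secondary):
        return False  # production never touched the new software

    # 2. Fail over to the upgraded secondary and check it under load.
    pair.fail_over()
    if not healthy(pair.active):
        pair.fail_over()  # back out just by failing back to the primary
        return False

    # 3. Upgrade the primary while it is idle, then fail back.
    primary.version = new_version
    pair.log.append(f"upgraded {primary.name}")
    pair.fail_over()
    return True


if __name__ == "__main__":
    pair = Pair(Router("primary", "1.0"), Router("secondary", "1.0"))
    print(rolling_upgrade(pair, "1.1", lambda r: True))
    print(pair.log)
```

At every step the router carrying traffic is one that has already been
tested, and the back-out path is a single fail_over() call.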

This is great for keeping availability up because software and
configuration changes can be done and tested without impacting production,
and there's at least one clear back-out path as well.

Sean
-- 
 C-Kermit.  C-Kermit Run.  Run Kermit Run.  -- Sean Reifschneider, 2003
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability
      Back off man. I'm a scientist.   http://HackingSociety.org/



