[lug] Hard Drive Failure / somehow software issue?

Nate Duehr nate at natetech.com
Tue Feb 12 18:44:47 MST 2008


Bear Giles wrote:

> To be clear, I wasn't saying that RAID can't handle a single disk 
> failure.  The problem is getting a false sense of security and ignoring 
> explicit warnings of (incipient) hardware failure since "RAID can handle 
> it".  It can only handle one bad disk typically, then you're losing data 
> again.  RAID buys you a bit of extra security, but you still have to 
> maintain the hardware.

Correction -- RAID 5 by itself can only survive the loss of a single drive.

RAID 1 with RAID 0 layered on top, in a minimum four-drive configuration, 
can lose two drives in most scenarios -- as long as both failures don't 
land in the same mirror pair.

Wikipedia's descriptions of RAID and layered RAID systems are pretty 
good, and they also discuss "non-standard" RAID implementations like 
Sun's ZFS, which is pretty darn cool...

http://en.wikipedia.org/wiki/RAID

Layering makes things more complex, but offers more redundancy at the 
cost of disk space.  The added benefit of faster writes with RAID 1/0 
(because no parity sums need to be calculated for each write) can also 
be significant if you're planning on doing a lot of heavy writing to the 
filesystems on the RAID.
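
For the curious, here's a minimal sketch of that layering with Linux's 
mdadm -- the four partition names (/dev/sd[abcd]1) are just hypothetical 
placeholders:

  # Build two RAID 1 mirror pairs (device names are made up)
  mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1

  # Stripe RAID 0 across the two mirrors for the 1/0 layering
  mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/md0 /dev/md1

/dev/md2 is then the device you put a filesystem on, and either mirror 
pair can lose a disk without taking the stripe down.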

There's also a brief discussion in the Wikipedia article of Linux's 
so-called MD-RAID 10, a driver-level RAID implementation that can use 
odd numbers of disks... which is weird.

There was some discussion lately on some of the Debian lists of how it's 
used and configured... very odd.  You tell md to build it, and on which 
disks, and it just "does it"... and Linux can see it.  Non-standard, but 
interesting.

I'm sure there are better places to read about it than Wikipedia... I'm 
just sharing that it's "out there" for those who like to tinker.

So...

RAID 5... 4 disks @ 500 GB, and you end up with roughly 1500 GB of 
usable space (about 1397 GiB as the OS reports it) before adding 
overhead for a filesystem, and you can only survive a single drive 
failure.

RAID 1/0 layering... same 4 disks @ 500 GB, and you drop to only 1000 
GB (about 931 GiB) of usable space pre-filesystem, but you can lose two 
disks, as long as they're in different mirror pairs.
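
If you want to sanity-check that arithmetic, a trivial shell snippet 
(plain decimal GB, pre-filesystem) does it:

  # Usable space for 4 x 500 GB drives
  echo "RAID 5:   $(( (4 - 1) * 500 )) GB"   # parity costs one drive -> 1500
  echo "RAID 1/0: $(( (4 / 2) * 500 )) GB"   # mirroring costs half   -> 1000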

Generally, since there are no parity calculations going on, RAID 1/0 
will do writes quite a bit faster than RAID 5, all other things being 
equal.

It's a big hit in data space, losing roughly 500 GB... but if the 
machine MUST be up... perhaps a better option than RAID 5.

For data disks, this new MD-RAID 10 driver built into Linux md is 
another new/interesting idea.  It'll handle odd numbers of drives; you 
just point md at whatever disks you want RAIDed, and the driver handles 
doing it correctly.  A single command builds a 1/0-style setup, with no 
more layering, and the kernel can read it natively.  Very odd, but also 
cool.

Less control (the driver handles the RAID layout for you), but maybe 
"simpler" for home setups... I'm not sure, since I've only seen postings 
from a few people using it.
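
As a rough sketch (device names hypothetical), building one of these 
really is a single command -- --layout=n2 means two "near" copies of 
every block:

  # One command, three disks, two copies of each block
  mdadm --create /dev/md0 --level=10 --layout=n2 --raid-devices=3 \
      /dev/sda1 /dev/sdb1 /dev/sdc1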

Etc... etc... etc...

Sun's ZFS is also "out there" getting some mindshare.  It's very 
interesting, but it does have some drawbacks.  Check out this video for 
some fun with ZFS...

http://www.opensolaris.org/os/community/zfs/demos/basics/

Managing ZFS looks WICKED cool... almost brain-dead simple, really... 
so if its performance limitations don't cause you heartburn, it's 
probably one of the things that will be the "wave of the future"... none 
of the RAID stuff has really made any native effort to be 
"administrator-friendly" until now.  Maybe commercial products like 
Veritas have, especially later versions (post 4.0)...
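
To give a feel for how brain-dead simple it is, here's a tiny sketch 
using the standard zpool/zfs commands -- the pool name "tank" and the 
Solaris device names are made-up examples:

  # Create a mirrored pool, carve a filesystem out of it, check health
  zpool create tank mirror c1t0d0 c1t1d0
  zfs create tank/home
  zpool status tank

No mkfs, no fstab editing... the filesystem is created and mounted in 
one step.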

And of course... for sheer entertainment value, Sun's ZFS "CSI" video 
from Germany is fun... even if it does make a serious marketing 
boo-boo... they proclaim their own server is "too expensive" as the 
premise for using a pile of... well, you watch and see... it's fun.

http://youtube.com/watch?v=1zw8V8g5eT0

Nate


