[lug] Robust storage

Sean Reifschneider jafo at tummy.com
Mon May 2 01:44:36 MDT 2005


On Sun, May 01, 2005 at 05:19:20PM -0600, Sebastian Sobolewski wrote:
>Actually it's worse than that.  I've had a case where reading from one 
>mirror returned different data than from the other in a RAID 1 set.  ( 

In a RAID-1 set you can recover from disc failure, but you can't recover
from corruption of either disc -- there's no way for the system to detect
which device has the "real" data.  With RAID-5, in theory you could verify
the parity on every read and detect that a stripe was inconsistent.  Parity
alone won't tell you which disc holds the bad block, though; for that you
need something extra, such as a per-block checksum or the second parity
block of RAID-6.

Of course, this would add extra overhead to every read, so vendors don't do
it.
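
To make the read-time check concrete, here is a minimal sketch, assuming
plain byte-wise XOR parity over equal-sized blocks.  The function names and
the tiny two-byte blocks are made up for illustration; this is not how any
particular RAID implementation is written.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_is_consistent(data_blocks, parity_block):
    """True if the stored parity matches the parity recomputed from the data."""
    return xor_blocks(data_blocks) == parity_block


# Three hypothetical data blocks and the parity written alongside them.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data)

# Simulate a silent error: flip 3 bits in the first byte of one block.
damaged = [data[0], bytes([data[1][0] ^ 0b111]) + data[1][1:], data[2]]

print(stripe_is_consistent(data, parity))     # clean stripe
print(stripe_is_consistent(damaged, parity))  # stripe with flipped bits
```

The check only reports that the stripe as a whole no longer adds up, and it
has to read every block in the stripe to do so, which is where the extra
overhead on each read comes from.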

>One of the disks was returning a silent, uncorrected bit error on 3 bits 
>in a single block ) RAID1/RAID5 are great for surviving single disk 
>crashes, but you can't depend on them for anything else.  For instance, if 
>the FS goes bad, you now have 2 copies of the badness.

Correct, RAID is not a substitute for backups.

>of battery left ) .  All files have MD5 hashes calculated and the hashes 
>are validated to make sure any files that have not changed in the last 

For one of the solutions we built for a customer, where we were archiving
data, we did basically this.  Their data doesn't change once committed to
disc, and the data files were stored in a database, so I just set up a job
to search the file-system and verify the availability and checksum of the
files.  We had horrible problems with ext3, but have had very good luck on
that system with JFS.
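
A verification job along those lines can be sketched roughly as follows.
This is only an illustration, not the job we actually ran: the `manifest`
mapping of path to expected checksum is a hypothetical stand-in for
whatever the database records, and MD5 is assumed only because that is the
hash mentioned in the quoted message.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading it in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(manifest):
    """Check every file in manifest (path -> expected hex MD5).

    Returns a list of (path, problem) tuples; an empty list means every
    file was present and matched its recorded checksum.
    """
    problems = []
    for path, expected in manifest.items():
        if not os.path.isfile(path):
            problems.append((path, "missing"))
        elif md5_of_file(path) != expected:
            problems.append((path, "checksum mismatch"))
    return problems
```

Run from cron, anything in the returned list is a file to restore from
backup.  This only works because the files never change once committed;
for data that is still being modified, a mismatch would not necessarily
mean corruption.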

Sean
-- 
 "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
tummy.com, ltd. - Linux Consulting since 1995.  Qmail, Python, SysAdmin
      Back off man. I'm a scientist.   http://HackingSociety.org/



