[lug] Robust storage
D. Stimits
stimits at comcast.net
Sun May 1 15:54:04 MDT 2005
Daniel Webb wrote:
> I just noticed I have file corruption in an old mailing list archive (gzip
> fails about 1/4 of the way through). It's not one I care about much,
> but it's got me thinking about the issue in general. I have no idea
> when this corruption happened, sometime in the last two years. Here are
> some questions I have for the experts out there:
>
> * How can I know if files have been corrupted through hardware errors?
> Would Linux software RAID have prevented this?
I doubt RAID would help. If anything it would add complications, since
your corruption was at the file level and not the filesystem level. RAID
mirroring keeps copies and can thus detect mismatches between mirrors,
which might help if, say, a disk defect caused the error...but most
defects don't simply show up as a bit alteration; they tend to show up
as fatal filesystem errors (which might be fixed by bad block
relocation).
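Since the corruption here is at the file level, one way to at least
notice it is to record a checksum of each file while it is known to be
good and compare again later. Below is a minimal sketch in Python,
assuming SHA-256 digests kept in an in-memory manifest; the function
names (sha256_of, build_manifest, changed_files) and the file name
archive.mbox are made up for illustration and are not from any existing
tool.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def changed_files(root, manifest):
    """Return sorted relative paths that are missing or whose digest differs."""
    bad = []
    for rel, expected in manifest.items():
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or sha256_of(full) != expected:
            bad.append(rel)
    return sorted(bad)


if __name__ == "__main__":
    # Hypothetical demo: create a file, record it, then flip one byte.
    with tempfile.TemporaryDirectory() as root:
        path = os.path.join(root, "archive.mbox")
        with open(path, "wb") as f:
            f.write(b"old mailing list archive\n")
        manifest = build_manifest(root)
        print(changed_files(root, manifest))  # nothing changed yet: []
        with open(path, "r+b") as f:          # simulate a corrupted byte
            f.seek(3)
            f.write(b"X")
        print(changed_files(root, manifest))  # ['archive.mbox']
```

This only tells you that a file differs from what was recorded, not why
or when, and a deliberate edit looks the same as corruption, so it fits
best for archives that are not supposed to change. In practice the
manifest would be saved somewhere other than the disk being checked.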
> * How can I know if files have been corrupted by bugs in the low-level
> block drivers (the filesystem drivers or in my case drbd)?
> Would Linux software RAID have prevented this? What happens if the
> corruption is caused by the RAID driver?
I'm guessing you need something more like filesystem journaling, and
that RAID itself will only help with overt failures. You can use ECC RAM
to guard against random bit failures in RAM (and believe it or not,
cosmic rays, radiation reaching the ground from space, can apparently
cause a rare random bit alteration, and do so more frequently at higher
altitudes like the Boulder area than at sea level).
> * What are some inexpensive solutions to this problem?
Journaling filesystems, and a UPS that guarantees full power during
brownouts and low-voltage situations. Brownouts and low voltage in
general are a big danger to data because the hardware doesn't
necessarily know the voltage is too low, and data can be altered without
any actual failure. Not all UPSes provide power during low voltage;
cheaper ones will provide power only during an overt outage. Never go
without a UPS, and never use a UPS that doesn't handle brownouts. Along
similar lines, if you have a marginal power supply that can under some
circumstances cause its own undervoltage (such as from the surge
current drawn by a CD drive or HD starting up), a UPS won't help. Don't
use an underrated power supply.
Among journaling filesystems, beware that almost all journal metadata
only, and not full data (full data journaling is a huge performance
hit). So the filesystem can be restored to a consistent state as of a
particular time, but not all data that was being written at the moment
of failure will necessarily be there...you might lose a few seconds,
but the filesystem will be undamaged. But then if you try to write
garbage you'll just get a good copy of garbage anyway.
D. Stimits, stimits AT comcast DOT net