[lug] Hard Drive Failure / somehow software issue?

User Dan Ferris dan at usrsbin.com
Tue Feb 12 16:54:33 MST 2008


I have dozens of Linux software RAID arrays running.

One of them failed a hard disk last week, and it was caught by both Nagios 
and mdadm.  The system is still running and is due to be replaced.

Dan

---

The LORD God is my strength, and he will make my feet like hinds' feet, and he will make me to walk upon my high places.

On Tue, 12 Feb 2008, Sean Reifschneider wrote:

> George Sexton wrote:
>> that. In reality, SW RAID under Linux is virtually useless for maintaining 
>> availability through a failure. SW RAID under Linux can prevent data 
>
> I wish you were right on that, because if the system had locked up when one 
> disc failed I wouldn't have had a client contact me late last week when the 
> second drive in their 3-drive RAID-5 failed.  This was one of their 
> development boxes (we're only responsible for their production machines), and 
> either the software RAID alert never got to their mail server, or it was 
> ignored (the system seemed correctly configured to e-mail an alert on 
> failure).
>
> As we've said before, it really depends on a number of things like the 
> controller and how gracefully it handles and reports failures to the software 
> RAID.
>
> Sean
> -- 
> Sean Reifschneider, Member of Technical Staff <jafo at tummy.com>
> tummy.com, ltd. - Linux Consulting since 1995: Ask me about High Availability
> _______________________________________________
> Web Page:  http://lug.boulder.co.us
> Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
> Join us on IRC: lug.boulder.co.us port=6667 channel=#colug


