[lug] Hard Drive Failure / somehow software issue?

bgiles at coyotesong.com
Tue Feb 12 14:15:45 MST 2008


> hda: dma_intr: status 0x58 { DriveReady SeekComplete DataRequest }
> ide: failed opcode was: unknown
> hda: set_drive_speed_status: status=0x58 {DriveReady SeekComplete
> DataRequest }
> ide: failed opcode was: unknown

Short story: if you get error messages, replace the disk.  They're cheap now.
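
For what it's worth, the 0x58 in those messages is the drive's status
register, and the words in braces are just the driver spelling out which
bits are set.  A minimal Python sketch of that decoding, assuming the
standard ATA status bit layout; the bit names are the ones the IDE driver
prints as best I recall them, and decode_ata_status is a made-up helper,
not anything from the kernel:

```python
# Standard ATA status register bits, highest bit first.
# Names follow what the old Linux IDE driver prints (from memory).
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ata_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


# 0x58 = 0x40 + 0x10 + 0x08
print(decode_ata_status(0x58))
# -> ['DriveReady', 'SeekComplete', 'DataRequest']
```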

> I'm baffled: If this is hardware, why do I only have problems when I boot?

Temperature, perhaps.  Or the lubricants breaking down.

Years ago I would routinely have 700+ days uptime.  Then when I went on a
trip and rebooted the system, it wouldn't come up.  But if I waited 30
minutes (with the computer still powered on), it would successfully
reboot.  I don't recall the details, but on some disks the lubricants
would polymerize(?) during really long uptimes so they were fine once the
system was running, but you would see failures for the first 30-60 minutes.
In the worst cases, with really old-time disks, you might even need to
whack the side of the disk to mechanically loosen it up.

> If it is software or something, why isn't it consistent between boots? And
> I've got a raid on everything... even if it were hardware, the raid should
> be able to handle that.. and when it boots fine, the raids don't even have
> to resync.

I can't remember, is it Sean who sees this as the bat signal?

RAID is dangerous because "it should be able to handle that".  It does...
and then people are complacent and ignore the hardware errors until
multiple disks go bad and suddenly they've lost everything.  Or at least
it's very difficult to get it back, depending on the RAID configuration
and which disks went.
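
One cheap guard against that complacency is to have something check the
arrays for you instead of trusting that RAID is quietly coping.  A rough
Python sketch, assuming Linux software RAID (md) and the usual
/proc/mdstat layout; the sample text and the degraded_arrays helper are
invented for illustration:

```python
import re

# Made-up sample in the usual /proc/mdstat layout.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 hdc1[1] hda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 hdc2[1]
      78043648 blocks [2/1] [_U]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return the names of md arrays that show a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # Status looks like "[2/1] [_U]"; "_" marks a missing or failed disk.
        status = re.search(r"\[\d+/\d+\]\s+\[([U_]+)\]", line)
        if status and current and "_" in status.group(1):
            degraded.append(current)
    return degraded


print(degraded_arrays(SAMPLE_MDSTAT))
# -> ['md1']
```

On a real system you would feed it the contents of /proc/mdstat from a
cron job and mail yourself whenever the list is non-empty.  It only
catches a disk that has already been kicked out of the array, so it's no
substitute for reading the kernel log.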



