[lug] My last hope....and nerve
D. Stimits
stimits at comcast.net
Mon Oct 29 18:28:17 MDT 2007
Steve A Hart wrote:
> I'm still dealing with two Promise UltraTrak RM8000 raid arrays and
> I'm getting desperate to find an answer to my problem. Here's the
> rundown and hopefully someone out there can help.
>
> Let's keep this simple. I have a single Promise UltraTrak RM8000
> connected to an LSI logic SCSI card. The OS is Fedora Core 6 and when
> the OS starts up, all I see is a repeating SCSI bus reset over and over.
>
> I can say with 100% certainty that the problem is NOT the following:
> SCSI host ID
> The SCSI cable
> the terminator (terminated correctly)
> LSI card
> motherboard of the host system
>
> That only leaves the OS and the promise raid itself. I know the
> RM8000 did run on FC4 running the 2.6.16 kernel but ever since the
> 2.6.18 kernels came out it has not worked. Now I have it connected
> to a FC6 system and still no luck.
A few things for the desperate to try or consider...
Drives themselves often go bad, and it isn't unusual for a batch of
brand new drives to arrive with several bad ones. Can you try to swap the drives
themselves between the good array and a bad array? Look very closely at
the pins, and the "feel" of how they seat into the bays. Swap the entire
set of drives, not just 1 at a time. Get a single bay hot swap
carrier/tray, and test them one at a time as well (this is an
extraordinarily useful and cheap test tool).
Feel the temperature with fingers on each of the drives in the working
and non-working sets, and see if any drive stands out as significantly
hotter or colder than the rest. That might point to a problem.
Try to format and mount individual partitions made from each disk, not
in a RAID set or LV...simplify it to the simplest use of each disk if
possible, without other layers on top of it.
Perhaps if volume labels are used, there is a naming problem...try
mounting those individual partitions by exact /dev/ name, without any
kind of automount and without any kind of label. Remove all fstab
entries with labels while doing this.
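
To find the label-based entries before removing them, something like the
following might help. It is only a sketch: the function name
fstab_label_entries is made up for illustration, and the device and mount
point in the trailing comment are placeholders, not values from this system.

```shell
# Hypothetical helper: print the lines of an fstab-format file whose
# device field is given as LABEL=... (commented-out lines are skipped
# because the pattern is anchored to the start of the line).
fstab_label_entries() {
    # $1: path to the file to inspect; defaults to /etc/fstab
    grep -E '^[[:space:]]*LABEL=' "${1:-/etc/fstab}"
}

# Example of mounting by exact device name instead of by label
# (placeholder device, filesystem type, and mount point; run as root):
#   mount -t ext3 /dev/sdb1 /mnt/test
```

Running fstab_label_entries with no argument lists the entries to comment
out or remove in /etc/fstab.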
If the disks are identical, and formatted identically, then after an
fdisk dump of geometry, a pipe through sort and uniq should be short:
fdisk -l | cut -d ' ' -f 2- | sort | uniq
(the cut is to remove the field with the drive name, e.g., /dev/sda,
sort followed by uniq removes duplicates, and there should be many
duplicates...what remains should be similar)
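
A small variation on the same pipeline counts how often each line occurs,
so the odd disk out is easier to spot. This is a sketch only; the function
name geometry_summary is invented here. Note that the "Disk /dev/..."
header lines still contain the device name after the cut, so they will
always show up as unique.

```shell
# Hypothetical helper: read `fdisk -l`-style text on stdin, drop the first
# space-separated field (the device name on partition lines), then count
# identical lines and list the most common ones first.
geometry_summary() {
    cut -d ' ' -f 2- | sort | uniq -c | sort -rn
}

# Typical use (needs root to read the disks):
#   fdisk -l 2>/dev/null | geometry_summary
```

Lines with a high count are shared by many disks; lines with a count of 1
are the ones worth a closer look.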
Perhaps searching with badblocks might indicate trouble on a boot
record...a long process.
If you run lspci -b, you'll notice that PCI listings have the form
bus:device.function (a PCI bus can be bridged to other PCI busses, and a
given physical device can contain more than one function, e.g., a sound
card can contain standard sound + joystick controller + midi). Remove or
swap devices which compete on the same bus...sometimes a buggy device
does not arbitrate fairly for DMA and collides with another PCI device
(in which case moving it to another slot will make it work).
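
To see at a glance which devices share a bus, the lspci output can be
grouped by its bus number. A rough sketch follows; the function name
devices_per_bus is made up, and it assumes the first field of each line
looks like bus:device.function with no PCI domain prefix in front.

```shell
# Hypothetical helper: read lspci-style lines on stdin and print each bus
# number followed by the number of functions listed on that bus.
# Assumes the first field is bus:device.function (no domain prefix).
devices_per_bus() {
    awk '{ split($1, a, ":"); count[a[1]]++ }
         END { for (b in count) print b, count[b] }' | sort
}

# Typical use:
#   lspci -b | devices_per_bus
```

A bus that holds both the SCSI card and several other busy devices is a
good candidate for the slot swapping described above.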
Physically swap anything involved, and look for any change in behavior,
see if anything is in common.
D. Stimits, stimits AT comcast DOT net