[lug] More Server Problems
George Sexton
gsexton at mhsoftware.com
Thu Feb 16 10:23:43 MST 2006
> -----Original Message-----
> From: lug-bounces at lug.boulder.co.us
> [mailto:lug-bounces at lug.boulder.co.us] On Behalf Of D. Stimits
> Sent: Wednesday, February 15, 2006 7:57 PM
> To: Boulder (Colorado) Linux Users Group -- General Mailing List
> Subject: Re: [lug] More Server Problems
>
> George Sexton wrote:
> > My server tanked again (reference):
> >
> > http://archive.lug.boulder.co.us/Week-of-Mon-20060123/031529.html
> >
> > Since that thread I upgraded to kernel version 2.6.14.6 and I'm still
> > having the same issue. After the server crashes (message unknown, but
> > I'm guessing it's related to ReiserFS file system corruption) I have to
> > run reiserfsck with the --rebuild-tree option.
> >
>
> I'm curious if you are using default journal options, or if you've
> experimented with things like placing journals on other disks? If not, I
> was just thinking that a bad block right at the journal might be
> difficult for the system to recover from...maybe if you can test
> different journal options.
I only have two drives, and they are configured as a software RAID pair. It
seems like the Linux kernel software RAID should be handling the odd bad
block.
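For checking whether the mirror is actually still running on both drives, here is a rough sketch that reads the `[expected/active]` member counts the md driver prints in /proc/mdstat. The function name `degraded_arrays` is made up for illustration. It only reports whether a member has dropped out of an array; it says nothing about bad blocks on drives that are still active.

```python
import re


def degraded_arrays(mdstat_text):
    """Return names of md arrays with fewer active members than expected.

    Expects text in the format of /proc/mdstat, where each array has a
    status line containing a pair like [2/2] (expected/active members).
    """
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            # Start of a new array entry, e.g. "md0 : active raid1 ..."
            current = header.group(1)
            continue
        counts = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and counts:
            expected, active = int(counts.group(1)), int(counts.group(2))
            if active < expected:
                degraded.append(current)
            current = None
    return degraded
```

On the server this could be called as `degraded_arrays(open("/proc/mdstat").read())`; an empty list would mean every array shows all of its members.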
>
> > I ran memtest86 version 3.2 through 4 complete cycles and found no
> > memory issues. I also checked the hardware monitoring from the BIOS. It
> > looks like the temperature is well within reason for CPU and
> > motherboard (31-37 C).
>
> memtest86 4 times is nice, but probably not definitive.
Agreed. 24 hours would be nice but I can't be down that long.
> The part that's really misleading is the cpu and motherboard
> temperature...what you really need is the hard drive temperature.
> Granted, it won't work if the cpu is overheating. But hard drive failure
> rates go up so dramatically with just a 10 degree increase it's
> unbelievable.
I remember some 4GB Micropolis A/V SCSI drives that I used to have that
would almost burn because they were so hot.
If and when I pull the box out of production, I'll get some cheap
thermometers and mount them to the drives. I've actually got a Hobo data
logger with a thermocouple that I could tape to a drive. Since it is a 1U
rackmount (Supermicro) the ventilation system is pretty well designed and
pretty intense. There's a lot of airflow through them.
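Short of taping a thermocouple to the drive, the drive's own SMART temperature reading might be enough. Below is a minimal sketch that pulls attribute 194 (Temperature_Celsius) out of the attribute table printed by `smartctl -A`. The function name `drive_temp_c` is hypothetical, and it assumes the drive reports attribute 194 at all with a raw value that starts with the temperature in degrees C, which not every drive does.

```python
def drive_temp_c(smartctl_output):
    """Return the raw value of SMART attribute 194 as an int, or None.

    Expects the attribute table from `smartctl -A <device>`, whose rows
    have ten columns: ID, name, flag, value, worst, thresh, type,
    updated, when_failed, raw_value.
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Column 0 is the attribute ID, column 9 is the raw value.
        if len(fields) >= 10 and fields[0] == "194" and fields[9].isdigit():
            return int(fields[9])
    return None
```

The output of `smartctl -A /dev/sda` (device name is just an example) could be captured with `subprocess` and passed in as a string; a `None` result means the attribute was not found in that form.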
George Sexton
MH Software, Inc.
http://www.mhsoftware.com/
Voice: 303 438 9585