[lug] More Server Problems

Sebastian Sobolewski spsobole at thirdmartini.com
Wed Feb 15 21:50:36 MST 2006


On Wednesday 15 February 2006 07:56 pm, D. Stimits wrote:
> George Sexton wrote:
> > My server tanked again (reference):
> >
> > http://archive.lug.boulder.co.us/Week-of-Mon-20060123/031529.html
> >
> > Since that thread I upgraded to 2.6.14.6 kernel version and I'm still
> > having the same issue. After the server crashes (message unknown, but I'm
> > guessing it's related to ReiserFS file system corruption) I have to run
> > reiserfsck and use the --rebuild-tree option.
>
> I'm curious if you are using default journal options, or if you've
> experimented with things like placing journals on other disks? If not, I
> was just thinking that a bad block right at the journal might be
> difficult for the system to recover from...maybe if you can test
> different journal options.
>
> > I ran memtest86 version 3.2 through 4 complete cycles and found no memory
> > issues. I also checked the hardware monitoring from the bios. It looks
> > like the temperature is well within reason for CPU and motherboard (31-37
> > C).
>
> memtest86 4 times is nice, but probably not definitive. The part that's
> really misleading is the cpu and motherboard temperature...what you
> really need is the hard drive temperature. Granted, it won't work if the
> cpu is overheating. But hard drive failure rates go up so dramatically
> with just a 10 degree increase it's unbelievable.
>
FYI:
smartctl (http://smartmontools.sourceforge.net/) can display drive
temperatures for drives that support this feature.
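
A minimal sketch of reading that value from a script, in case it helps with
logging temperatures over time. It assumes an ATA drive that reports SMART
attribute 194 (Temperature_Celsius) in the attribute table printed by
"smartctl -A"; not every drive does, and some report temperature under a
different attribute. The function names and the /dev/sda device path are
placeholders of mine, not anything from George's setup.

```python
import re
import subprocess


def parse_temperature(smartctl_output):
    """Return the drive temperature in Celsius from `smartctl -A` text.

    Looks for the attribute row with ID 194 and reads the leading integer
    of its RAW_VALUE column. Returns None if no such row is found.
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns; RAW_VALUE is the tenth and may be
        # followed by extra text such as "(Min/Max 20/45)".
        if len(fields) >= 10 and fields[0] == "194":
            match = re.match(r"\d+", fields[9])
            if match:
                return int(match.group())
    return None


def read_temperature(device):
    """Run `smartctl -A` on a device (usually needs root) and parse it."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit bits for warnings too
    )
    return parse_temperature(result.stdout)


if __name__ == "__main__":
    # /dev/sda is a placeholder; substitute the drive you want to check.
    print(read_temperature("/dev/sda"))
```

Running this from cron every few minutes and appending the result to a log
would show whether the drive runs hot in the period before a crash.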

-Seb

> ...
>
>
> D. Stimits, stimits AT comcast DOT net


