[lug] More Server Problems

D. Stimits stimits at comcast.net
Wed Feb 15 19:56:38 MST 2006


George Sexton wrote:
> My server tanked again (reference):
> 
> http://archive.lug.boulder.co.us/Week-of-Mon-20060123/031529.html
> 
> Since that thread I upgraded to 2.6.14.6 kernel version and I'm still having
> the same issue. After the server crashes (message unknown but I'm guessing
> its ReiserFS file system corruption related) I have to run reiserfsck and
> use the --rebuild-tree option.
> 

Are you using the default journal options, or have you experimented 
with things like placing the journal on another disk? If not, my thought 
is that a bad block right at the journal might be difficult for the 
system to recover from, so it may be worth testing different journal 
options.

> I ran memtest86 version 3.2 through 4 complete cycles and found no memory
> issues. I also checked the hardware monitoring from the bios. It looks like
> the temperature is well within reason for CPU and motherboard (31-37 C).

Four passes of memtest86 is nice, but probably not definitive. The part 
that's really misleading is the CPU and motherboard temperature; what 
you really need is the hard drive temperature. Granted, the system won't 
work if the CPU is overheating. But hard drive failure rates go up 
dramatically with just a 10 degree increase.
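
One way to get at the drive temperature, as a rough sketch only: if 
smartmontools is installed and the drive supports SMART, `smartctl -A` 
prints an attribute table, and many drives report temperature as 
attribute 194 (Temperature_Celsius). The device path /dev/sda and the 
function names below are my own assumptions for illustration, and some 
drives use a different attribute (190, for example) or report nothing at 
all.

```python
import subprocess


def parse_drive_temperature(smartctl_output):
    """Return the raw value of SMART attribute 194, or None if absent.

    Expects the text printed by `smartctl -A`. Attribute rows have the
    columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        # The raw value is the tenth column; it may be followed by extra
        # text such as "(Min/Max 20/45)", which is ignored here.
        if len(fields) >= 10 and fields[0] == "194":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def read_drive_temperature(device="/dev/sda"):
    """Run smartctl on the given device (usually needs root) and parse it."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True,
        text=True,
    )
    return parse_drive_temperature(result.stdout)
```

Calling read_drive_temperature() as root should return the temperature 
in degrees C on drives that expose attribute 194, and None otherwise. 
Logging that value over time would show whether the drive runs hot 
before a crash.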

...


D. Stimits, stimits AT comcast DOT net


