[lug] More Server Problems
D. Stimits
stimits at comcast.net
Wed Feb 15 19:56:38 MST 2006
George Sexton wrote:
> My server tanked again (reference):
>
> http://archive.lug.boulder.co.us/Week-of-Mon-20060123/031529.html
>
> Since that thread I upgraded to 2.6.14.6 kernel version and I'm still having
> the same issue. After the server crashes (message unknown but I'm guessing
> it's ReiserFS file system corruption related) I have to run reiserfsck and
> use the --rebuild-tree option.
>
I'm curious whether you are using the default journal options, or whether
you've experimented with things like placing the journal on another disk.
If not, I was just thinking that a bad block right at the journal might be
difficult for the system to recover from. It may be worth testing
different journal options.
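
One way to see which journal-related options are actually in effect is to look at the mount options the kernel reports. The following is only a rough sketch in Python, assuming a Linux system where /proc/mounts is readable; the function name reiserfs_mounts is made up for illustration and is not part of any tool mentioned in this thread.

```python
def reiserfs_mounts(mounts_text):
    """Return (device, mountpoint, options) for each reiserfs entry.

    mounts_text is expected to be the contents of /proc/mounts, where
    each line has the form: device mountpoint fstype options dump pass
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip short or malformed lines and any non-reiserfs filesystem.
        if len(fields) >= 4 and fields[2] == "reiserfs":
            # Options are a comma-separated list, e.g. "rw,noatime".
            found.append((fields[0], fields[1], fields[3].split(",")))
    return found
```

Passing it the text read from /proc/mounts gives one tuple per ReiserFS mount. Any journal-related entries in the options list would indicate non-default settings; which options appear there depends on the kernel version.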
> I ran memtest86 version 3.2 through 4 complete cycles and found no memory
> issues. I also checked the hardware monitoring from the BIOS. It looks like
> the temperature is well within reason for CPU and motherboard (31-37 C).
Four passes of memtest86 is nice, but probably not definitive. The part
that's really misleading is the CPU and motherboard temperature...what you
really need is the hard drive temperature. Granted, the machine won't work
if the CPU is overheating. But hard drive failure rates go up dramatically
with just a 10 degree increase in drive temperature.
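
If the drive supports SMART, its temperature can often be read with smartctl from smartmontools. Below is a minimal Python sketch, assuming smartmontools is installed, the script runs with enough privilege to query the drive, and the drive reports temperature as SMART attribute 194 (some drives use a different attribute or none at all). The function names and the device path in the usage note are hypothetical.

```python
import subprocess


def drive_temperature_celsius(smartctl_output):
    """Return the raw value of SMART attribute 194, or None if absent.

    smartctl_output is the text printed by `smartctl -A <device>`.
    Attribute rows have the columns:
    ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    """
    for line in smartctl_output.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "194":
            try:
                # The raw value may be followed by extra text such as
                # "(Min/Max ...)", so only the first token is used.
                return int(fields[9])
            except ValueError:
                return None
    return None


def read_drive_temperature(device):
    """Run `smartctl -A` on device and parse the temperature (assumed layout)."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True,
        text=True,
    )
    return drive_temperature_celsius(result.stdout)
```

For example, read_drive_temperature("/dev/hda") would query the first IDE disk, assuming that is where the drive lives. A result of None means the attribute was not found, not that the drive is cool.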
...
D. Stimits, stimits AT comcast DOT net