[lug] Seeking thoughts on this crash

Gary Hodges Gary.Hodges at noaa.gov
Mon Jan 5 14:50:47 MST 2004


Nick Golder wrote:

>On 2004-01-05 11:41 -0700, Gary Hodges wrote:
>  
>
>>Gary Hodges wrote:
>>    
>>
>>>Peter Hutnick wrote:
>>>      
>>>
>>>>Chuck Morrison wrote:
>>>>        
>>>>
>>>>>While my first inclination would be hardware, testing with the 
>>>>>original - worked once at least - system would point you in the 
>>>>>right direction, I think. Of course if it's software it may be 
>>>>>something other than the kernel settings. There could be libraries 
>>>>>that could be different and cause issues too.
>>>>>          
>>>>>
>>>>Have you eliminated heat (chipset and CPU) as the culprit?
>>>>        
>>>>
>>>After the first crash I started monitoring the CPU temp.  If you 
>>>believe the accuracy of lmsensors it was at 66.9 deg C once and 66.7 
>>>deg C another time.  These are within a degree of what the original 1.4 
>>>Athlon ran at, and the same temp as during my first performance test 
>>>after installing the new CPU.  The HSF is from PC Power and Cooling 
>>>and is rated for up to something like 3200+ CPUs.  I did replace a 
>>>seized chipset fan at the time I replaced the CPU.  I didn't know it 
>>>had seized before I observed all the fans with the case open.  Maybe 
>>>the replacement has too.  I've had problems with other computers in 
>>>the past due to heat, so it is something I'm always worried about.  I 
>>>wish these CPUs ran cooler, but according to specs even 66 deg C is 
>>>well under max operating temps.
>>>      
>>>
>>It looks like heat was playing a part in the crashes.  I changed the 
>>physical location of the computer and the CPU runs ~4 deg C cooler while 
>>processing large amounts of data.  I have reprocessed data several times 
>>now without crashing, so it must be that the CPU was getting too hot.  
>>Before the break it also locked up while in screensaver mode which had 
>>never happened before.  It seems to me that the CPU has become more 
>>sensitive to heat.  I should probably RMA the bugger.
>>    
>>
>
>I just dug through the archives and my quick greps didn't reveal a
>mention of the north bridge.  Do you have a fan on the heatsink of the
>north bridge?  
>
There is a fan on the motherboard, but I don't know if it is on the north 
bridge.  Here is a picture of the board:

http://www.giga-byte.com/Motherboard/FileList/ProductImage/photo_7dx_40_big.jpg

>I did a similar upgrade to an Asus motherboard and the
>difference of 2 degrees F would tip the system over the edge.  Dropping
>a fan on the northbridge blowing down over the heatsink and adding a
>case fan that would pull in outside ambient air that just so happened to
>blow over the north bridge made a huge difference.
>
I have two fans blowing in and three blowing out (one is the PS fan).  
The way my computer was situated, heat from the exhaust fans would build 
up behind the case and get drawn back in some lower vent holes. 

Is it reasonable to accept a CPU that operates so close to the temp at 
which it fails?  I've always been a big supporter of AMD CPUs, but this 
heat issue bugs me.  Way back when, I dug around the AMD web site and ran 
across a white paper that said the max operating temp of Athlons was 
upwards of 90 deg C.  Makes me wonder about the CPU in my machine.  Of 
course, maybe this has nothing to do with the CPU, but is in fact 
related to the north bridge as you have suggested.  Then again, it never 
happened before I installed this CPU...
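
For anyone who wants to track the margin rather than eyeball it, here is a
minimal sketch in Python.  It assumes the readings arrive as text with one
sensor per line, roughly the way lmsensors prints them; the exact line
format, the function names, and the sample values are illustrative
assumptions, and the 90 deg C limit is only the figure I recall from the
white paper, not a verified spec.

```python
import re

# Matches readings such as "+66.9°C", "66.7 C", or "66.7 deg C".
# The degree sign is written as \u00b0 to keep the source plain ASCII.
TEMP_RE = re.compile(r"([+-]?\d+(?:\.\d+)?)\s*(?:\u00b0|deg)?\s*C\b")


def parse_temps(text):
    """Return the first Celsius reading on each line of sensors-style text.

    Only the first match per line is kept, so limit or hysteresis values
    printed later on the same line are ignored.
    """
    temps = []
    for line in text.splitlines():
        match = TEMP_RE.search(line)
        if match:
            temps.append(float(match.group(1)))
    return temps


def headroom(temps, limit_c):
    """Degrees C between the hottest reading and an assumed limit."""
    return limit_c - max(temps)


if __name__ == "__main__":
    # Hypothetical sample, shaped like the readings mentioned in this thread.
    sample = "temp1: +66.9\u00b0C (limit = +70\u00b0C)\ntemp2: +41.0 deg C\n"
    readings = parse_temps(sample)
    # 90.0 is an assumed maximum, taken from memory rather than a datasheet.
    print("readings:", readings)
    print("headroom: %.1f deg C" % headroom(readings, 90.0))
```

Logging this at intervals during a long data run would show whether the
lockups line up with the margin shrinking, or whether they happen with
plenty of headroom left, which would point away from the CPU.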

Cheers,
Gary