[lug] System dies occasionally - was: (no subject)

keith.herold at cox.net keith.herold at cox.net
Fri Jun 21 11:51:16 MDT 2002


It turns out I had already turned off apmd because I figured it might be causing the problem.  There are apparently *no* power settings in the BIOS, so either the system manages power with no user control, or...  well, I don't know.

I upgraded the kernel to the latest version and installed lm_sensors.  The reported temperature on the CPUs never moves from 261.5F.  The system hasn't crashed since the upgrade, so I haven't tried the noapic boot option yet.  If the upgrade alone doesn't hold up, I will try it.
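
As a sanity check on that reading: 261.5F converts to exactly 127.5C, far hotter than a working CPU should run, so a value that never moves from it may be a pegged or unconnected sensor input rather than a real temperature (an assumption, not something confirmed for this board).  Below is a minimal Python sketch for logging readings over a long run.  The function names, the log path, and the polling interval are all hypothetical; the only external piece assumed is the `sensors` command that ships with lm_sensors, whose output is recorded verbatim rather than parsed, since its format varies by chip.

```python
import os
import subprocess
import time


def fahrenheit_to_celsius(temp_f):
    """Convert a Fahrenheit reading to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def is_stuck(readings, tolerance=0.0):
    """Return True if every reading is within `tolerance` of the first one.

    Fewer than two readings cannot show a stuck sensor, so return False.
    """
    if len(readings) < 2:
        return False
    first = readings[0]
    return all(abs(r - first) <= tolerance for r in readings)


def log_sensors(path="sensors.log", interval_s=60, samples=10):
    """Append timestamped `sensors` output to `path` (hypothetical default).

    Each sample is flushed and fsynced so the last reading before a hard
    hang is more likely to be on disk.
    """
    with open(path, "a") as log:
        for i in range(samples):
            # Run the lm_sensors CLI with no arguments and keep its raw output.
            result = subprocess.run(["sensors"], capture_output=True, text=True)
            log.write(time.strftime("%Y-%m-%d %H:%M:%S") + "\n")
            log.write(result.stdout + "\n")
            log.flush()
            os.fsync(log.fileno())
            if i < samples - 1:
                time.sleep(interval_s)
```

Here fahrenheit_to_celsius(261.5) gives 127.5, and is_stuck([261.5, 261.5, 261.5]) returns True.  Syncing after every sample is slow but deliberate: if the box dies hard, the log should still hold the last reading taken before it went down.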

Thanks for the help.  (And thank god RH finally updated the kernel upgrade howto; before this, I had never successfully upgraded a kernel.  Usually I give up and download the latest distro instead.  The machine does not, will not, and cannot have access to the outside world, so sayeth the PTB, so up2date was out.)

--Keith
> 
> From: Nate Duehr <nate at natetech.com>
> Date: 2002/06/21 Fri AM 11:57:46 EDT
> To: lug at lug.boulder.co.us
> Subject: Re: [lug] System dies occasionally - was: (no subject)
> 
> No one has mentioned this one yet...
> 
> Are there any power "saving" modes turned on in the BIOS?  Apmd running?
> 
> Perhaps a bug in one of those and the machine thinks it's "asleep"?
> 
> Just another thought...
> 
> Nate
> 
> On Thu, 2002-06-20 at 08:31, Ian S. Nelson wrote:
> > > I think all Intel chipsets have it built in, especially the later-model, 
> > > high-power-consumption chips.  Your BIOS will tell you the temperature 
> > > of the CPUs; after it happens, reset and check it.
> > 
> > So the box just appears to go to sleep?  I'd look at the power supply 
> > first.    
> > 
> > > My second thought would be whether the CPU has some kind of thermal 
> > > protection built in.  I believe the newer AMDs will shut themselves down 
> > > if they get too hot; I'd assume Intel chips have something similar.
> > 
> > I'd also check for BIOS updates that might be available.  I had a dual 
> > Pentium 3 which had terrible serial port problems, the box would 
> > randomly lock up when I pushed a fair amount of data over it (just 
> > > freeze hard, screen would stay on) and Tyan had a BIOS update that made 
> > > the problem go away.  This doesn't sound like a software lockup 
> > > though; it sounds more like the hardware is shutting down for some reason.  
> > 
> > Ian
> > 
> > 
> > keith.herold at cox.net wrote:
> > 
> > > >Intel, but I don't think this machine has a temperature monitor on the CPUs...
> > >
> > >--Keith
> > >  
> > >
> > >>From: Rob Riggs <rob at pangalactic.org>
> > >>Date: 2002/06/19 Wed PM 09:41:40 EDT
> > >>To: lug at lug.boulder.co.us
> > >>Subject: Re: [lug] System dies occasionally - was: (no subject)
> > >>
> > >>AMD or Intel? Have you tried logging your CPU temp to see if it's 
> > >>overheating? I read somewhere (sorry, no reference) of some dual-Athlon 
> > >>systems behaving this way when they overheat.
> > >>
> > >>-Rob
> > >>
> > >>keith.herold at cox.net wrote:
> > >>
> > >>    
> > >>
> > > >>>Howdy!  I have just started using a dual-processor machine (2x 1.7G, w/3G RAM) and RH 7.1.  Install works fine and the SMP kernel (2.4.7-10smp) appears to work, but occasionally (randomly, and not triggered by a particular application) the system goes dead (the monitor displays the 'no signal' warning) and will not wake up (the HD light just blinks, and the power light on the box is off).  I need to be able to run 8 hour+ processes (using both 
> > > >>>CPUs), but it doesn't seem to be stable.
> > >>>
> > >>>I worked at another place that had a similar problem (not the same manufacturer though) and they were never able to resolve it.  Can anyone here help?
> > >>>
> > >>>      
> > >>>
> > >>_______________________________________________
> > >>Web Page:  http://lug.boulder.co.us
> > >>Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
> > >>Join us on IRC: lug.boulder.co.us port=6667 channel=#colug
> > >>
> > >>    
> > >>
> > >
> > >  
> > >
> > 
> > 
> > 
> -- 
> Nate Duehr, nate at natetech.com
> 
> 



