[lug] advice on a problem

Jose Luis Salas joseluis.salas09 at gmail.com
Fri Jul 30 22:28:11 MDT 2010


My first step would be to run a test on different hardware.

If you have a spare workstation, you could re-image it from scratch,
load the same apps on it, and run the script there.
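
While the test runs, it might also help to leave a small logger going on
both the problem box and the test box, so that after a lockup the last
line in the log shows roughly when it happened and what the network
counters looked like just before. Below is only a rough sketch in Python,
not something tested on this setup. It assumes a Linux box with
/proc/net/dev, and that /var/tmp is on local disk rather than NFS. The log
path, the 5 second interval, and the function names are all made up for
illustration.

```python
import time


def parse_net_dev(text):
    """Parse the contents of /proc/net/dev into {iface: (rx_bytes, tx_bytes)}."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue
        # Field 0 is bytes received, field 8 is bytes transmitted.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def format_sample(now, counters):
    """Build one log line: a local timestamp followed by per-interface byte counts."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    parts = ["%s rx=%d tx=%d" % (name, rx, tx)
             for name, (rx, tx) in sorted(counters.items())]
    return stamp + " " + " ".join(parts)


def log_forever(path="/var/tmp/netwatch.log", interval=5):
    """Append a sample every `interval` seconds.

    `path` is a hypothetical location; it should be on local disk so the
    logger keeps writing even if the NFS mount hangs.
    """
    while True:
        with open("/proc/net/dev") as f:
            counters = parse_net_dev(f.read())
        with open(path, "a") as out:
            out.write(format_sample(time.time(), counters) + "\n")
        time.sleep(interval)
```

Calling log_forever() in a terminal (or from an init script) on both
machines before the user starts the matlab script would give two logs to
compare. If the log on the problem box stops at the time of the lockup,
the whole system hung; if it keeps going, the hang is more likely limited
to NFS or the display.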

Good luck!


On Fri, Jul 30, 2010 at 10:10 PM, David L. Anselmi <anselmi at anselmi.us> wrote:

> Steve A Hart wrote:
> > Out of the 52 clients, I have 1 system that is frequently locking up for
> > no apparent reason at all.  Roughly 1-3 times per week this system just
> > goes belly up and locks like it lost the NFS connection to the server.
> > The kicker to this is that this system sits next to an exactly identical
> > system (hardware and software setup) which acts completely normal and
> > does not lock up.  All logs on the problem system show no errors of any
> > kind.  Also, both systems are plugged into the same switch so it's not a
> > network issue.
>
> If you swap the wires where they plug into the 2 machines and the same
> machine still locks up, you'll know the problem is in the box.
> If the lockups follow the wire to the other machine, it could be in the
> cable or switch port.
>
> > My only commonality/trend I see is that when the system locks, the user
> > is running a heavy matlab script that displays 20+ plots one right after
> > the other so that all 20+ plots are visible in 20+ different windows.
> > This script would run successfully multiple times in a week and then at
> > some apparently random point, it locks.  It should be noted that if this
> > same user runs the same code on the identical system, it runs fine
> > every time.
>
> Does it run on an identical system for weeks on end?  Perhaps it's related
> to this particular workload when there's an inopportune burst of network
> traffic.
>
> If an identical system works consistently you could at least swap them and
> solve this user's problem.
>
> Probably there's some debugging you could turn on but I haven't done that.
> I'd be curious what's happening on the network when it locks.
>
> Dave
> _______________________________________________
> Web Page:  http://lug.boulder.co.us
> Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
> Join us on IRC: irc.hackingsociety.org port=6667 channel=#hackingsociety
>



-- 
Jose Luis Salas

E-mail: joseluis.salas09 at gmail.com