[lug] Binary compatibility and segfaults

Zan Lynx zlynx at acm.org
Mon Sep 18 10:36:29 MDT 2006


On Sun, 2006-09-17 at 23:00 -0600, Daniel Webb wrote:
> I'm running a lot of simulations for some research, and I'm trying to use as
> many computers as possible.  Something strange though: one of my machines
> seems to have something wrong somewhere, and I'm at a loss.
> 
> If I do variable dumps printf-style as the program runs, it's clear the data in
> the variables is off by one from what it should be (if a=1, b=2, then when run
> on this machine b == 1 and a == something unknown).
> 
> I even tried to compile the code on this machine, and it compiles without
> error, but still segfaults when run.  I'm running this same code on three
> other machines with no problems (and it runs valgrind clean).
> 
> I thought corruption, but I did a debsums check of everything on the system,
> and I put a new kernel in from another machine.  Still the same.  Any ideas?
> 
> This is a Debian stable system and the problem happened with both a 2.4.27
> kernel and 2.6.15 kernel.

The first thing that I would do is run a memory test.  Memtest86+ is a
good one.

Then perhaps make a list of every hardware difference the machine has
from the other machines and try to test something about each one.
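
One rough way to start that list, assuming you save the output of
"cat /proc/cpuinfo" (or any similar "key : value" text) from a good
machine and from the bad one: the sketch below compares two such dumps
and reports the fields that differ.  The function and file names are
made up for illustration, and fields like "cpu MHz" or "bogomips" will
usually show up as harmless noise.

```python
def parse_fields(text):
    """Collect 'key : value' lines into a dict of key -> set of values."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and anything that is not key/value
        key, _, value = line.partition(":")
        fields.setdefault(key.strip(), set()).add(value.strip())
    return fields


def diff_fields(good_text, bad_text):
    """Return {key: (good_values, bad_values)} for keys whose values differ."""
    good, bad = parse_fields(good_text), parse_fields(bad_text)
    diffs = {}
    for key in sorted(set(good) | set(bad)):
        g, b = good.get(key, set()), bad.get(key, set())
        if g != b:
            diffs[key] = (sorted(g), sorted(b))
    return diffs


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python hwdiff.py good_cpuinfo.txt bad_cpuinfo.txt
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as f_good, open(sys.argv[2]) as f_bad:
            result = diff_fields(f_good.read(), f_bad.read())
        for key, (g, b) in result.items():
            print(f"{key}: good={g} bad={b}")
```

Anything it prints (CPU model, flags, cache size) is a candidate to
test on its own.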

Or maybe single-step the program at the machine-code level and try to
spot where it diverges from proper execution.
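
A coarser version of the same idea, before going down to single
instructions: have the program print its variables at fixed checkpoints
(the printf-style dumps you already have would do), capture that output
on a good machine and on the bad one, and find the first line where the
two logs disagree.  A minimal sketch, assuming the runs are
deterministic (same inputs, same random seed) and each log has one
checkpoint per line; the names are illustrative only.

```python
from itertools import zip_longest


def first_divergence(good_lines, bad_lines):
    """Return (line_number, good_line, bad_line) for the first mismatch,
    or None if the two traces are identical.  Line numbers start at 1.
    A trace that ends early shows up as None on the short side."""
    pairs = zip_longest(good_lines, bad_lines)
    for number, (good, bad) in enumerate(pairs, start=1):
        if good != bad:
            return number, good, bad
    return None


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python tracediff.py good_trace.log bad_trace.log
    if len(sys.argv) == 3:
        with open(sys.argv[1]) as f_good, open(sys.argv[2]) as f_bad:
            hit = first_divergence(f_good.read().splitlines(),
                                   f_bad.read().splitlines())
        if hit is None:
            print("traces are identical")
        else:
            number, good, bad = hit
            print(f"first difference at line {number}")
            print(f"  good: {good}")
            print(f"  bad:  {bad}")
```

The checkpoint just before the first mismatch tells you where to start
stepping through the machine code.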
-- 
Zan Lynx <zlynx at acm.org>