[lug] scsi drive, mmap and bus errors

Bear Giles bear at coyotesong.com
Sat May 4 13:49:21 MDT 2002


I'm not sure if this is a hardware or software error... opinions?

In a nutshell, I'm seeing bus errors when I try to read a ~600 MB 
compressed tar file.  The actual code (stripped *way* down) is:

  main() {
      fd = open(tarball, O_RDONLY);
      fstat(fd, &buf);
      start = mmap(NULL, buf.st_size, ..., fd, 0);
      for (each file in index) {
          strm.next_in = start + file.offset;
          strm.avail_in = buf.st_size - file.offset;
          ...
          inflateInit2(&strm, 15);
          inflateSync(&strm);
          inflate(&strm, Z_SYNC_FLUSH);
          inflateEnd(&strm);
      }
      munmap(start, buf.st_size);
      close(fd);
  }

The point of the code sample is to emphasize that the entire 600 MB
compressed tar file is mmap'd, then randomly accessed with the assistance
of a separate index file and read via the zlib library.  
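
For reference, the mapping step that the stripped-down sample glosses over
can be sketched roughly as below.  This is only an illustration, not the
actual tool code: `map_whole_file` and `member_window` are hypothetical
names, and `MAP_PRIVATE` is an assumption since the real flags are elided
in the sample above.

```c
#define _POSIX_C_SOURCE 200809L   /* expose POSIX declarations in strict modes */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map an entire file read-only.  On success returns the base address and
 * stores the file size in *len; on failure returns NULL. */
const unsigned char *map_whole_file(const char *path, size_t *len)
{
    struct stat st;
    void *base;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    /* MAP_PRIVATE is an assumption; the original flags are not shown. */
    base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                    /* the mapping outlives the descriptor */
    if (base == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return base;
}

/* Input window for one indexed member: everything from its offset to the
 * end of the mapping.  Returns 0 on success, -1 if offset is out of range. */
int member_window(const unsigned char *base, size_t len, size_t offset,
                  const unsigned char **next_in, size_t *avail_in)
{
    if (offset >= len)
        return -1;
    *next_in = base + offset;
    *avail_in = len - offset;
    return 0;
}
```

Note that zlib's `avail_in` field is an `unsigned int`, so a caller would
have to narrow the `size_t` computed here before handing it to the stream;
that still fits comfortably for a ~600 MB file.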

I know that sequential access would be much faster when reading the
entire tarball, but these tools are optimized for those cases where
you're only extracting a handful of files and the indexed access will
be faster than a sequential scan even with the relatively high seek
times on a CD drive.

Anyway, everything is cool on my IDE hard disk and a loopback of the
ISO image.  But when I try to read a physical CD-R from my SCSI drive,
I get bus errors.

This is annoying, but not critical.  It may be due to a problem in the
Linux SCSI controllers that causes SIGBUS to be thrown instead of blocking
until the read head can be moved and the disc sector read.

But even with this workaround, I'll still eventually get an error out
of inflateInit (from its own bus error handler?) and the process grinds
to a halt anyway.  It also looks like the problem clusters around
the larger files - again consistent with a model that the system just
throws a SIGBUS when 'read' gets ahead of the hardware.  

If this is the case, a workaround to the latter problem may be replacing
the single zlib call over the whole buffer with a series of smaller
calls.  I strongly prefer the "single shot" approach because 'extract()'
mmaps the output file as well, and life is *sooooo* easy. :-)

Has anyone else seen this type of problem, or does anyone know of a
recommended workaround?  I know it's possible that my hardware is just a
little flaky, but I don't want to jump to that conclusion if there's a
clean workaround to the problem.

Bear


