[lug] scsi drive, mmap and bus errors
D. Stimits
stimits at idcomm.com
Sun May 5 14:29:40 MDT 2002
The write errors I've run into were mostly caused by bad cables, bad
termination, or accidental dual termination. The compression/decompression
errors I've seen came from bad RAM. A particularly odd case is the
defective i840 chipset, which has an IO-APIC problem that can show up as
SCSI bus errors (or even IDE or ethernet errors...just whatever is in use
under heavy hardware IRQ load). Is this the i840 SMP chipset? There were
also some SCSI bugs prior to the 2.4.6-pre2 kernel that could do this on
some chipsets (beyond just i840).
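
One way to separate a hardware fault from the mmap path is to read the
same file with plain pread() calls. A failed read then comes back as an
errno (typically EIO) at a known offset instead of a SIGBUS. What follows
is only a rough sketch under assumptions: scan_for_read_errors() is a
hypothetical helper name, and it expects a regular file on the mounted
disc, not the raw device.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: read a regular file front to back in chunk-sized
 * pieces. Returns the number of chunks that failed with a read error, or
 * -1 if the file could not be opened or examined. The count of bytes read
 * successfully is stored in *total. */
static long scan_for_read_errors(const char *path, size_t chunk, long long *total)
{
    assert(chunk > 0 && total != NULL);
    *total = 0;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    char *buf = malloc(chunk);
    if (buf == NULL || fstat(fd, &st) != 0) {
        free(buf);
        close(fd);
        return -1;
    }

    long bad = 0;
    off_t off = 0;
    while (off < st.st_size) {
        ssize_t n = pread(fd, buf, chunk, off);
        if (n > 0) {
            *total += n;
            off += n;
        } else if (n == 0) {
            break;                  /* file is shorter than fstat() reported */
        } else if (errno != EINTR) {
            fprintf(stderr, "read error at offset %lld: %s\n",
                    (long long)off, strerror(errno));
            bad++;
            off += (off_t)chunk;    /* skip the failing chunk, keep scanning */
        }
        /* on EINTR, fall through and retry the same offset */
    }

    free(buf);
    close(fd);
    return bad;
}
```

If a scan of the tarball on the CD-R reports errors at the same offsets on
every run, the disc is suspect. If the offsets move around between runs,
that points more toward cabling, termination, or the controller.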
D. Stimits, stimits at idcomm.com
Bear Giles wrote:
>
> I'm not sure if this is a hardware or software error... opinions?
>
> In a nutshell, I'm seeing bus errors when I try to read a ~600 MB
> compressed tar file. The actual code (stripped *way* down) is:
>
> main() {
>     fd = open(tarball, O_RDONLY);
>     fstat(fd, &buf);
>     start = mmap(fd, buf.st_size,...);
>     for (each file in index) {
>         strm.next_in = start + file.offset;
>         strm.avail_in = buf.st_size - file.offset;
>         ...
>         inflateInit2(&strm, 15);
>         inflateSync(&strm);
>         inflate(&strm, Z_SYNC_FLUSH);
>         inflateEnd(&strm);
>     }
>     munmap(start, buf.st_size);
>     close(fd);
> }
>
> The point of the code sample is to emphasize that the entire 600 MB
> compressed tar file is mmap'd, then randomly accessed with the assistance
> of a separate index file and read via the zlib library.
>
> I know that sequential access would be much faster when reading the
> entire tarball, but these tools are optimized for those cases where
> you're only extracting a handful of files and the indexed access will
> be faster than a sequential scan even with the relatively high seek
> times on a CD drive.
>
> Anyway, everything is cool on my IDE hard disk and a loopback of the
> ISO image. But when I try to read a physical CD-R from my SCSI drive,
> I get bus errors.
>
> This is annoying, but not critical. It may be due to a problem in the
> Linux SCSI controllers that causes SIGBUS to be thrown instead of blocking
> until the read head can be moved and the disc sector read.
>
> But even with this workaround, I'll still eventually get an error out
> of inflateInit (from its own bus error handler?) and the process grinds
> to a halt anyway. It also looks like the problem clusters around
> the larger files - again consistent with a model that the system just
> throws a SIGBUS when 'read' gets ahead of the hardware.
>
> If this is the case, a workaround to the latter problem may be replacing
> the single zlib buffer with a series of smaller calls. I strongly
> prefer the "single shot" approach because 'extract()' mmap's the output
> file as well, and life is *sooooo* easy. :-)
>
> Has anyone else seen this type of problem, or does anyone know the
> recommended workaround? I know it's possible that my hardware is just a little
> flaky, but I don't want to jump to that conclusion if there's a clean
> workaround to the problem.
>
> Bear
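
On the SIGBUS itself: on Linux, a page of a file mapping that cannot be
read in is reported to the process as SIGBUS, so a bad sector under an
mmap'd file kills the program where read() would have returned an error.
A minimal sketch of turning that into an ordinary error return follows.
guarded_copy() and demo_truncated_mapping() are hypothetical names, and the
demo fakes the failed page-in by truncating the file under the mapping, so
it is an assumption that a real disc error behaves the same way on a given
kernel and driver.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf bus_jmp;               /* resume point after a SIGBUS */
static volatile sig_atomic_t bus_armed;  /* nonzero only inside guarded_copy */

static void on_sigbus(int sig)
{
    (void)sig;
    if (bus_armed)
        siglongjmp(bus_jmp, 1);
    /* SIGBUS from somewhere else: restore the default action and re-raise. */
    signal(SIGBUS, SIG_DFL);
    raise(SIGBUS);
}

/* Hypothetical helper: copy len bytes out of a mapped region.
 * Returns 0 on success, -1 if touching the source raised SIGBUS. */
static int guarded_copy(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    int rc = 0;
    if (sigsetjmp(bus_jmp, 1) == 0) {
        /* volatile reads so the compiler cannot drop the access */
        const volatile unsigned char *s = src;
        unsigned char *d = dst;
        bus_armed = 1;
        for (size_t i = 0; i < len; i++)
            d[i] = s[i];
    } else {
        rc = -1;                         /* arrived here via siglongjmp */
    }
    bus_armed = 0;
    sigaction(SIGBUS, &old, NULL);
    return rc;
}

/* Demo: map a two-page temp file, read it, truncate it, read again.
 * Returns 0 if the first read succeeds and the second is caught as a
 * SIGBUS, 1 if setup failed, 2 on any other outcome. */
static int demo_truncated_mapping(void)
{
    char path[] = "/tmp/sigbus_demo_XXXXXX";
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    int fd = mkstemp(path);
    if (fd < 0)
        return 1;
    unlink(path);
    if (ftruncate(fd, (off_t)(2 * page)) != 0) { close(fd); return 1; }

    char *map = mmap(NULL, 2 * page, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) { close(fd); return 1; }

    char buf[16];
    int before = guarded_copy(buf, map, sizeof buf);         /* file intact */
    if (ftruncate(fd, 0) != 0) { munmap(map, 2 * page); close(fd); return 1; }
    int after = guarded_copy(buf, map + page, sizeof buf);   /* page past EOF */

    munmap(map, 2 * page);
    close(fd);
    return (before == 0 && after == -1) ? 0 : 2;
}
```

This only reports the failure; it does not make the data readable. An
extractor built this way could retry the member or skip it and move on to
the next index entry instead of dying partway through the tarball.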