[lug] ssh compression

Bear Giles bgiles at coyotesong.com
Wed May 1 18:34:02 MDT 2002


> two somewhat pathological examples follow.

I agree - those are pathological examples!  You can only really
compare compression performance with the type of data that you'll
actually be compressing.  The pathological cases may be as misleading
as saying that "trees don't give you any benefit over linked lists"
when you only test them with data that's already sorted.

This is especially true with compression libraries.  We know, from
the pigeonhole principle, that there must be a lot of files that are
larger compressed than uncompressed.  This is unlikely with highly
redundant files like "all zeros" or even English text.  But you might
see it with pseudorandom data, or data previously compressed with
a different compression engine.
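A quick way to see this for yourself (a minimal sketch using Python's
zlib module; the 64 KB buffer size is just a placeholder):

    import os
    import zlib

    zeros = b"\x00" * 65536          # highly redundant input
    random_data = os.urandom(65536)  # pseudorandom, essentially incompressible

    print(len(zlib.compress(zeros)))        # shrinks to a tiny fraction
    print(len(zlib.compress(random_data)))  # slightly LARGER than 65536

The all-zero buffer collapses to a few dozen bytes, while the random
buffer comes out a little larger than it went in, since deflate still
has to pay for block headers and the stream checksum.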

In a completely orthogonal direction, there can also be other
important considerations than raw speed or extent of compression.
E.g., I've been working on a CD-R backup utility off-and-on for
some time, and I'm using ZLIB (gzip) compression where I flush
the compression buffer before each file.  This adds about 5% to the
size of the tar file, but it allows me to recover from damaged
media and to "seek" within the compressed archive.
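Roughly the idea, sketched with Python's zlib module rather than the
C API, and with raw deflate streams for simplicity (the member
contents and the offset bookkeeping here are just illustrative):

    import zlib

    # Hypothetical file contents standing in for members of a tar archive.
    members = [b"first file contents " * 100, b"second file contents " * 100]

    # Raw deflate (wbits=-15) so a fresh decompressor can be started at
    # any flush point without needing the stream header.
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)

    archive = bytearray()
    offsets = []                      # restart point for each member
    for data in members:
        offsets.append(len(archive))
        archive += comp.compress(data)
        # Z_FULL_FLUSH pads to a byte boundary and resets the compressor's
        # history -- that's the few-percent size penalty, but it creates a
        # point where decompression can start from scratch.
        archive += comp.flush(zlib.Z_FULL_FLUSH)
    archive += comp.flush()           # terminate the stream

    # "Seek" to the second member: start a new decompressor at its offset.
    decomp = zlib.decompressobj(-15)
    print(decomp.decompress(bytes(archive[offsets[1]:]))[:20])

Because each member starts at a full-flush point, a damaged block only
costs you the members whose data it covers; everything after the next
restart point is still recoverable.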

This probably isn't an issue here - TCP/IP should guarantee that
the compressed file was accurately transmitted, but it's something
to consider when you're creating files for removable media.

Bear


