[lug] [Slightly OT] File Management?

Rob Nagler nagler at bivio.biz
Tue Mar 24 08:25:22 MDT 2009


Lee Woodworth writes:
> It sounds like you have looked at rsync and decided it isn't useful

We use rsync to mirror the data.

> When you say "SATA backup solutions" do you mean appliance type
> of units or something like Lefthand Networks iScsi units?

What you have to look at with all these solutions is "end to end"
performance for large, varied data sets.  What I've found is that if
the workload looks like some benchmark, everything works great.
When I use it for my particular application (large & varied), it
doesn't.  The controllers can only get the data to the right place on
the disks so fast, so I haven't found that controllers are the issue.

Consider our cp --link solution for the "whoops problem".  We have to
delete the link trees older than, say, a week.  This requires a very
large "rm -rf".  It takes hours.
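
For anyone who hasn't seen the link-tree approach, here is a minimal
sketch of the idea in Python.  It is not our actual script.  The
function names, the one-directory-per-snapshot layout, and the use of
the snapshot directory's mtime as its age are all assumptions made for
illustration.

```python
import os
import shutil
import time
from pathlib import Path


def link_snapshot(source: Path, dest: Path) -> None:
    """Mirror the directory tree of `source` under `dest`, hard-linking files.

    Roughly what `cp --link -R` does: directories are created fresh, and
    every file becomes a hard link to the original, so no data is copied.
    Symlinks and special files get no special handling in this sketch.
    """
    for dirpath, _dirnames, filenames in os.walk(source):
        target_dir = dest / Path(dirpath).relative_to(source)
        target_dir.mkdir(parents=True, exist_ok=True)
        for name in filenames:
            os.link(Path(dirpath) / name, target_dir / name)


def prune_snapshots(snapshot_root: Path, max_age_days: float = 7.0) -> list:
    """Delete snapshot directories older than `max_age_days`.

    Assumption: each direct child of `snapshot_root` is one snapshot and
    its mtime approximates when it was taken.  Returns the removed names.
    """
    cutoff = time.time() - max_age_days * 86400
    removed = []
    for entry in sorted(snapshot_root.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            if entry.stat().st_mtime < cutoff:
                # Equivalent of "rm -rf": one unlink per file in the tree.
                shutil.rmtree(entry)
                removed.append(entry.name)
    return removed
```

Note that the prune step still has to unlink every name in every old
tree, even though almost no data blocks are freed.  With lots of small
files that is a lot of metadata updates, which is why the delete is
slow.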

> You mentioned system board and controller failures for your own built
> systems. Were there issues with the SATA disk drives?

I didn't build the systems.  I bought them from integrators (jncs,
pcpitstop, and promise).  The SATA disks are great.

> Copied about 120GB w/ average file size of 50M.

That's nice, but that's not reality for us.  We have *lots* of small
files, which change frequently.

> Write rates were 25MB/s to each disk in the mirror (no caching).

Was this a clean file system?  If so, I wouldn't be surprised.

> drives (no NCQ), so I would expect with a new MB with PCI express, decent
> SATA controller cards (non-raid), and higher performance drives (with real
> NCQ) the transfer rates would be higher even across multiple mirror sets.

Again, the CPU or the bus isn't the problem.  It's the physical disk
positioning that's the problem.  When you employ caching, the
controller can optimize the write queue.  Without caching, you have to
wait for the disks to spin around to the right place.  All controllers
we have tried took longer than 24 hours to do what our backups
require, unless we turned on caching.  With caching, it still is
slower than I'd like, but the initial rsync "snapshot" is at network
speeds so it's good enough.
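
To illustrate why a reorderable write queue matters, here is a toy
model in Python.  It is only a sketch: it treats the disk as a line of
track numbers, counts head travel only, ignores rotational latency,
and does not describe how any particular controller schedules writes.
The function names and the simple one-sweep-up, one-sweep-down
ordering are assumptions chosen for the example.

```python
def head_travel(start: int, tracks: list) -> int:
    """Total distance the head moves when servicing `tracks` in order."""
    travel = 0
    position = start
    for track in tracks:
        travel += abs(track - position)
        position = track
    return travel


def sweep_order(start: int, tracks: list) -> list:
    """Reorder queued writes into one upward sweep, then one downward sweep.

    This is the kind of reordering a cache makes possible: the writes are
    held long enough to be sorted by position instead of arrival time.
    """
    upward = sorted(t for t in tracks if t >= start)
    downward = sorted((t for t in tracks if t < start), reverse=True)
    return upward + downward


queue = [90, 10, 80, 20, 70]   # writes in arrival order
start = 50                     # current head position

print(head_travel(start, queue))                      # 300
print(head_travel(start, sweep_order(start, queue)))  # 120
```

Without a cache the writes go out in arrival order and the head
bounces back and forth.  With the queue held and sorted, the same five
writes need well under half the movement in this example.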

Rob


