[lug] Reliable SATA server?

Sean Reifschneider jafo at tummy.com
Sun Apr 29 16:13:29 MDT 2012


On 04/21/2012 09:03 AM, Rob Nagler wrote:
> Over the years we've struggled to find a reliable SATA server for our
> backups.  I have tried numerous versions of SATA computers, and they
> all seem to fail under our loads, always.  The failure is usually in

I designed our backup infrastructure, handling hundreds of Linux systems,
and it has been working extremely well.  Our current, 4th generation backup
system has been in production for almost 2 years, so I feel like I can
speak with some authority here...  :-)

The way I'm reading your message, it sounds like the major problem you are
having is the disk subsystem being saturated, though you also mention
problems with bays and drives failing.  Well, drives fail.  We've been
quite happy with Hitachi and don't run into many drive failures, but we do
expect them.

For hardware we've been fairly happy with Supermicro.  Our backup servers
are these:

   http://www.supermicro.com/products/system/2U/6025/SYS-6025B-3.cfm

These are older machines now.  Out of a few dozen, I think we've had one
with a backplane failure.

However, this will *NOT* solve an I/O starvation problem.

If you're willing to go with SAS 15K drives, you might as well just go with
SSDs...  They're similarly priced per GB, and *WILL* solve your I/O
starvation...

However, you may want to consider abandoning the rsync+"cp -l" backup
approach.  Creating all those hard links means lots of small random I/Os,
which hard drives are extremely terrible at.
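
To make that concrete, here is a rough Python sketch of what a recursive
"cp -l" has to do.  It is only an illustration, not our tooling: the
function names and the throwaway test tree are made up.  The point is that
the number of link operations tracks the number of files, not the amount of
data that changed.

```python
import os
import tempfile


def hardlink_copy(src, dst):
    """Mimic a recursive "cp -l": recreate directories, hard-link every file.

    Returns the number of link() calls made, which is one per file.
    """
    links = 0
    for root, dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target, exist_ok=True)
        for name in files:
            # One metadata update per file, whether or not the file changed.
            os.link(os.path.join(root, name), os.path.join(target, name))
            links += 1
    return links


def demo(file_count=1000):
    """Build a throwaway tree (hypothetical data) and hard-link copy it."""
    with tempfile.TemporaryDirectory() as base:
        src = os.path.join(base, "current")
        os.makedirs(src)
        for i in range(file_count):
            with open(os.path.join(src, "file%d" % i), "w") as handle:
                handle.write("x")
        return hardlink_copy(src, os.path.join(base, "snapshot"))
```

Calling demo(1000) makes 1000 link() calls even though nothing in the tree
changed, and every one of them touches directory and inode metadata.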

rsync+"cp -l" is fine for small home backups, but it's really not suitable
for anything more serious, for reasons you are running into right now.  I
had played with it at home many years ago, but even for home backups I
found it to be far too resource-intensive.  Though I'll admit that my home
backup situation is more serious than many...

Have you considered switching from "cp -l" to something like ZFS and using
snapshots instead of "cp -l" to preserve the history?  We run our backups,
on commodity drives and the above chassis, using zfs-fuse under Linux, and
it works exceptionally well.  Though it *IS* fairly RAM-intensive -- we
have 16GB RAM in our backup servers, but also have it running on smaller
setups with only 4GB or 8GB of RAM.
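
As a minimal sketch of the idea (not our actual scripts), one backup run
becomes an rsync into a single live copy followed by one "zfs snapshot".
The dataset name, mount point and date-based snapshot naming here are
assumptions for illustration only.

```python
import datetime
import subprocess


def backup_commands(source, dataset, mountpoint, when):
    """Return the rsync and "zfs snapshot" command lines for one backup run."""
    snapshot = "%s@%s" % (dataset, when.strftime("%Y-%m-%d"))
    return [
        # Update the single live copy in place; no hard-link tree is built.
        ["rsync", "-a", "--delete", source, mountpoint],
        # Freeze the current state; the history lives in the snapshot.
        ["zfs", "snapshot", snapshot],
    ]


def run_backup(source, dataset, mountpoint):
    """Run both steps in order (needs rsync and zfs installed).

    check=True raises on a non-zero exit, so no snapshot is taken
    if rsync fails.
    """
    today = datetime.date.today()
    for command in backup_commands(source, dataset, mountpoint, today):
        subprocess.run(command, check=True)
```

For example, run_backup("host1:/etc/", "backup/host1", "/backup/host1/")
would sync the remote directory and then snapshot the hypothetical
backup/host1 dataset under today's date.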

Also, how many backups do you have running in parallel?  We run at most 6
per system, I think.
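
If your scheduler doesn't already cap this, a worker pool is one simple way
to do it.  This is a hedged sketch: backup_one stands in for whatever runs
a single host's backup, and the dummy jobs exist only to show that the cap
holds.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL = 6  # the "at most 6 per system" figure mentioned above


def run_all(hosts, backup_one, max_parallel=MAX_PARALLEL):
    """Back up every host, never running more than max_parallel at once."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # list() waits for all jobs and re-raises the first failure, if any.
        return list(pool.map(backup_one, hosts))


def peak_concurrency(host_count=20, max_parallel=MAX_PARALLEL):
    """Run dummy jobs and report the highest number seen running together."""
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def fake_backup(host):
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(0.01)  # stand-in for the real rsync run
        with lock:
            state["running"] -= 1
        return host

    hosts = ["host%d" % i for i in range(host_count)]
    run_all(hosts, fake_backup, max_parallel)
    return state["peak"]
```

The remaining hosts simply queue until a worker frees up, so the disks
never see more than max_parallel rsync streams at a time.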

Also, do you run the "cp -l" at the same time that the "rsync" is running?
Ditto for the removal of historic copies?  If so, you probably want to
defer those so they run outside of your backup window.  We run the snapshot
creation during the backup window, but snapshots are far, far less
expensive than the "cp -l" approach.

We do the historic archive removal outside of the backup window.  It is
more expensive than the snapshot creation, because it has to unlink
no-longer-reachable data blocks, but it is still probably a few orders of
magnitude less expensive than removing a "cp -l" archive.
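
A pruning job along those lines might look like the sketch below.  The
window hours, the retention period and the dataset@YYYY-MM-DD naming are
all assumptions for the example, not a description of our setup; only
"zfs destroy" itself is the real command.

```python
import datetime


def expired_snapshots(names, today, keep_days):
    """Pick snapshots named dataset@YYYY-MM-DD that are older than keep_days."""
    cutoff = today - datetime.timedelta(days=keep_days)
    expired = []
    for name in names:
        stamp = name.rsplit("@", 1)[-1]
        try:
            taken = datetime.datetime.strptime(stamp, "%Y-%m-%d").date()
        except ValueError:
            continue  # not a date-named snapshot; leave it alone
        if taken < cutoff:
            expired.append(name)
    return expired


def in_backup_window(now, start_hour=22, end_hour=6):
    """True if the hour falls in a window that wraps past midnight."""
    return now.hour >= start_hour or now.hour < end_hour


def destroy_commands(names, now, keep_days):
    """Return "zfs destroy" command lines, or nothing during the window."""
    if in_backup_window(now):
        return []
    return [["zfs", "destroy", name]
            for name in expired_snapshots(names, now.date(), keep_days)]
```

The list of names could come from something like
"zfs list -H -t snapshot -o name", and the returned commands can then be
run one at a time from cron during the day.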

But really, it's not the servers so much as how you are using them that is
the problem.

Sean


