[lug] Reliable SATA server?

Rob Nagler nagler at bivio.biz
Sun Apr 29 18:05:41 MDT 2012


Hi Sean,

Good stuff, thanks. :)

> The way I'm reading your message, it sounds like the major case where you
> are having problems is with the disc sub-system being saturated.

The problem isn't throughput; it's corruption or failure.  I now know
that cheaper SATA drives have a problem with long ERC timeouts, which
confuses the RAID controller.  That's probably triggered by the amount
of data going through the system.  Buying more expensive SATA drives
lets me adjust the ERC timeouts.  I have heard that WD allows this on
their low-end drives, but given the price difference, there's little
point in not going with the more expensive drives.  I always thought
the theory behind RAID was the "I", but I guess I'm a bit naive. ;-)
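
For what it's worth, on drives that do support SCT Error Recovery
Control, smartctl from smartmontools can query and cap the timeout;
the device name below is just an example:

    # query current SCT ERC settings (values are in tenths of a second)
    smartctl -l scterc /dev/sda
    # cap read/write error recovery at 7s, below the controller's timeout
    smartctl -l scterc,70,70 /dev/sda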

> For hardware we've been fairly happy with Supermicro.

I was running those for a while, too, but they had weird failures,
including a backplane failure.  I ran a few other boxes as well, but
all seemed to have problems.

> If you're willing to go with SAS 15K drives, you might as well just go with
> SSDs...  They're similarly priced per GB, and *WILL* solve your I/O
> starvation...

> However, you may want to consider abandoning the rsync+"cp -l" back
> approach.  Hard-links are lots of small random I/Os, which hard drives are
> extremely terrible at.

Backup systems are interesting to me.  The number one use for them is
the "whoops" factor, e.g. Rob accidentally deletes an important file.
With cp --link (long option names let the code be self-documenting
:-), I can ssh into the backup machine, do a find or just scp, and the
file is restored.  No muss, no fuss.
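
A typical "whoops" restore looks roughly like this (the hostname,
snapshot date, and paths are made up for illustration):

    # locate the file in a recent hard-link snapshot on the backup box
    ssh backup1 find /backup/myhost/20120428 -name important.conf
    # copy it straight back into place; no restore tool required
    scp backup1:/backup/myhost/20120428/etc/important.conf /etc/important.conf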

> rsync+"cp -l" is fine for small home backups, but it's really not suitable
> for anything more serious, for reasons you are running into right now.

I have seen more expensive and complex backup systems fail horribly.
We've been backing up this way for 15 years, and we store quite a bit
of data, so I have to say it works and it's less filling. ;-)   The
cool thing about rsync is that it is extremely reliable, because it is
so widely used.  It's probably the most widely used backup command out
there.  For example, Carbon Copy Cloner relies on rsync over ssh.
That's also why we use Apache for our process manager and Linux for
our operating system. :)

> Have you considered switching from "cp -l" to something like ZFS and using
> snapshots instead of "cp -l" to preserve the history?

Here's how our system works:

* rsync to the primary
* rsync to the secondaries
* Redundantly, use the vendor's backup solution
* We have run standby DBs and are implementing a new strategy (see HA thread)

On all backup instances:
* cp --link
* tar up weekly/monthly
* Copy tarballs to offline disk
* Take offline disks to two vaults periodically

I would say that's a little more than your average home backup system...
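
In case it's useful, the core of the nightly rotation on a backup
instance boils down to something like the sketch below.  It's only an
illustration; the hostnames, paths, and retention are placeholders,
not our actual scripts:

    #!/bin/bash
    # simplified sketch of the nightly rsync + cp --link rotation
    SRC=primary:/var/data
    DST=/backup/primary
    TODAY=$(date +%Y%m%d)
    YESTERDAY=$(date --date=yesterday +%Y%m%d)

    # hard-link yesterday's snapshot so today's rsync only stores changed files
    cp --archive --link "$DST/$YESTERDAY" "$DST/$TODAY"

    # pull changes over ssh; --delete keeps the snapshot a faithful mirror
    rsync --archive --delete -e ssh "$SRC/" "$DST/$TODAY/"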

> We run our backups,
> on commodity drives and the above chassis, using zfs-fuse under Linux, and
> it works exceptionally well.

I have this thing about backups: I want them to be extremely
reliable.  I'm the CEO, and I handle them personally, because my
entire business rests on them.  If there's a bug in our software that
causes a slow corruption to go unnoticed, I want to be able to get
that data back.  Again, that's the primary use for backups in my
experience.

> Also, how many backups do you have running in parallel?

Everything is sequential over the net.  The databases are exported in
parallel, and then the rsyncs run.

> Also, do you run the "cp -l" at the same time that the "rsync" is running?

No.

> Ditto for the removal of historic copies?

No.

> We do the historic archive removal outside of the backup window.  It is
> more expensive than the snapshot creation, because it has to unlink
> no-longer-reachable data blocks, but still probably a few orders of
> magnitude less expensive than removing a "cp -l" archive.

The "rm -rf" works pretty well, but does take quite a while.

> But really, it's not the servers so much as how you are using them that is
> the problem.

I guess I beg to differ.  We have tested servers (RAID
controllers/buses, actually) that simply couldn't keep up with a load
that other, equivalently priced servers could handle.  I went through
several different types of servers before settling on the current
ones, which are fairly reliable as long as I don't put two CPUs in
them: with two, the servers hang, which suggests the drivers are
buggy.  We ended up using some of the dual-CPU boxes in JBOD mode and
rebooting them right before they were needed; taking the disks offline
automatically helped, too.  I have never had corruption or hangups on
our Dell servers, no matter how hard we drove them, and at times we
drive them pretty hard.  I was just "saving money" with the cheaper,
whitebox servers.  Oh, and these machines are running at ViaWest in
their class A facility.

I've been working with computers far too long to accept "how you are
using them" as the problem.  I have been amazed at the crappiness of
some hardware I've used, such as IBM BladeCenters, which remind me of
the good old days of Dinosaur Husbandry.  Computers should work no
matter what load you put on them; unless you are thrashing, they
should just be slow, not corrupt your data or break.  When you see
MTBFs of 1.2M hours on disk drives, it should mean something.  Indeed,
I do see that on my Dell boxes with my old SCSI disks.  With these
cheaper SATA drives, I expect more frequent failures, but not failures
in RAID controllers, which I have also seen on too many different
brands, except Dell.  While Dell does source from a variety of RAID
vendors, I have a feeling they test the heck out of the controllers
they choose.  I've heard horror stories about Dell boxes, but I just
haven't experienced them.

Anyway, I'm looking at getting refurb Dell boxes, which will have the
more expensive SATA drives in them, not SAS.

Rob


