[lug] Setting up failover in Linux?

Rob Nagler nagler at bivio.biz
Sun Apr 29 18:17:49 MDT 2012


>   Putting /usr on DRBD is almost certainly not what you want.  You may
>   think it is, but it's not.  :-)  You really, REALLY, need to think about
>   just what the components you need are, and only those should exist on
>   DRBD.

That's my general approach: know what needs to be replicated.  I was
just throwing it out there.
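
For concreteness, this usually comes down to one small DRBD resource
per thing that actually has to move in a failover, not the whole
filesystem.  A minimal sketch of a resource definition (the hostnames,
devices, and addresses here are made up):

    resource pgdata {
      device    /dev/drbd0;
      disk      /dev/sdb1;      # holds only the Postgres data directory
      meta-disk internal;
      on alpha { address 10.0.0.1:7789; }
      on beta  { address 10.0.0.2:7789; }
    }

Everything else (/usr, the OS, the application code) just gets
installed on both nodes the ordinary way.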

>   system and if there were a hardware failure or the like during the
>   maintenance you would need to manually resolve it.

That's the odd part to me.  I understand it, but it means I need to
design around this.  I don't have a problem with it (we're mostly
done), but it does make me suspicious of all the hype around HA from
vendors like Amazon.

> virtualization environments that provide fail-over at the virtual system
> level.  So, you treat the virtual as just a regular system, and the
> virtualization HA will fail over in the event of hardware problems, etc...

I have a hard time believing that these systems actually work when you
really need them.  I'm sure they are fine if there are external events
like network or power outages, but I'm not so sure they work when the
rubber meets the road: your system fails during peak activity (when
systems usually fail).  I don't see how they'd coordinate all the data
so that, say, Xapian wouldn't be corrupt (has happened to me) or
Postgres is in sync with the application software changes.

I haven't actually heard anyone on this list say they were running an
HA system, what application survived the failover, and when it
happened.  I do want it to be like RAID in that it handles the real
life event that disks fail.  In the application space, you have the
real life event that RAID controllers, power supplies, and CPUs fail.
Much rarer than disks, but it still happens.

> In most cases, a fail-over is the same as a reboot.  So it's not as fast as

Not interested in fast.  I want accurate.  I want to know what the
state of the secondary is when the failover has to happen.  I don't
really want automatic failover, because I've been trying to figure
out the network partition problem for 30 years, and dammit, I still
can't figure it out.  :)
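
The usual partial answer to the partition problem is quorum: a node
may promote itself only if it can see a strict majority of the
cluster, so the two sides of a split can never both become primary.
That contains the damage; it doesn't solve the problem.  A toy sketch
(the peer names, port, and health check are made up; a real setup
would use corosync/Pacemaker or similar, plus fencing):

    import socket

    PEERS = ["alpha", "beta", "witness"]   # hypothetical cluster members

    def reachable(host, port=7789, timeout=1.0):
        # Stand-in health check: can we open a TCP connection?
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    def may_promote(self_name):
        # Count our own vote plus every peer we can still reach.
        votes = 1 + sum(1 for p in PEERS
                        if p != self_name and reachable(p))
        # A minority partition fails this test, so both sides can
        # never promote at once.  Losing the majority means a human
        # decides, which matches not wanting automatic failover.
        return votes > len(PEERS) // 2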

> if you design your application for the HA system.  For example, we have a
> router setup that will fail over without losing any packets...

That I get.

BTW, that's what we had at Tandem, but of course, that was ages ago,
and we forgot all about how to build reliable transaction systems.

Seriously, I do think the problem is solvable if you solve it globally
with a transaction manager, and that's sort of the solution we are
working on.  It's just that I would have thought people had done that,
which is the point of my questions.  Relying on SANs and redundant VMs
is not transactional, and I think it's flawed.  I am pretty sure our system
would survive such a failure, but it might, for example, miss a file
in Xapian that was added to Postgres.
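
To make the missed-file case concrete: the usual fix is to record the
intended index update in the same Postgres transaction as the data
itself, then have an idempotent worker drain that queue into Xapian,
so a crash leaves work pending but never half-done.  A sketch, using
sqlite3 as a stand-in for Postgres (the schema and the
index_into_xapian hook are made up):

    import sqlite3

    db = sqlite3.connect(":memory:")   # stand-in for Postgres

    db.executescript("""
        CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT);
        CREATE TABLE index_queue (doc_id INTEGER, done INTEGER DEFAULT 0);
    """)

    def add_doc(body):
        # The document and its "index me" record commit atomically,
        # so the indexer can never miss a file that made it into the
        # database.
        with db:
            cur = db.execute("INSERT INTO docs (body) VALUES (?)",
                             (body,))
            db.execute("INSERT INTO index_queue (doc_id) VALUES (?)",
                       (cur.lastrowid,))

    def drain_queue(index_into_xapian):
        # Safe to re-run after a crash or failover, provided the
        # indexing call is idempotent (Xapian's replace_document
        # keyed by doc id is).
        pending = [row[0] for row in db.execute(
            "SELECT doc_id FROM index_queue WHERE done = 0")]
        for doc_id in pending:
            index_into_xapian(doc_id)
            with db:
                db.execute("UPDATE index_queue SET done = 1"
                           " WHERE doc_id = ?", (doc_id,))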

> transaction-level and consistent across locations, you have to deal with
> the latency between those locations...

The trick with transaction-level consistency is system-wide replay.  I
haven't seen that except at Tandem.  It's expensive, because
everything has to be a transaction, even an operating system upgrade.
However, it is the "right thing" to do, if you want robust
secondaries.
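
In case "system-wide replay" sounds abstract, the idea is just
write-ahead logging applied to the whole system: every change is
appended to a durable log before its side effects happen, and a
secondary catches up by replaying the log from its last checkpoint.
A bare-bones sketch (the log format and the apply callback are made
up):

    import json, os

    class ReplayLog:
        # Append-only operation log; a secondary replays it to
        # rebuild or catch up its state.

        def __init__(self, path):
            self.path = path

        def append(self, op, **args):
            # Durable intent first, side effects second: the
            # write-ahead rule.
            with open(self.path, "a") as f:
                f.write(json.dumps({"op": op, "args": args}) + "\n")
                f.flush()
                os.fsync(f.fileno())

        def replay(self, apply, from_seq=0):
            # apply() must be idempotent: after a crash, the tail of
            # the log may be delivered twice.
            with open(self.path) as f:
                for seq, line in enumerate(f):
                    if seq >= from_seq:
                        apply(json.loads(line))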

Rob


