[lug] [Slightly OT] File Management?

Nate Duehr nate at natetech.com
Tue Mar 24 16:20:01 MDT 2009


I fully agree, Paul, that there are probably outliers (like nuclear weapons
blowing up banks, from your own example) so far-fetched that, if they
happen, humans will have to negotiate reasonable terms to fix the problem.
Computers don't fix those things; people do.

For the NORMAL day-to-day filesystem events that make up the VAST MAJORITY
of our "time"-based problems, we can easily quantify and demystify the
business risks of any particular system (not just backups) we choose to use
to operate our businesses.

Most of these systems are granular down to moments of "time" so small that
we humans find them hard to quantify... and amounts of data so small that
they can easily be recovered by other means when low-risk events happen,
like systems being powered down in the middle of a "transaction".

That's the world sysadmins live in... we have to know and think abstractly
enough to understand the risk, and quantify it into "worst case" scenarios
for those making the business decision to use the technology.

That's no different from quantifying the risks taken when that same nuclear
weapon hits an office building full of people typing up customers' bills on
old-fashioned typewriters.  Is that data loss going to be the most important
thing on the company's mind, or are they going to write off the loss and
move on to far more important things if such "crazy" events happen?

The trick here is NOT to act as if computers and the technology are in any
way "mysterious".  They're just machines, with common modes of failure and
common ways of dealing with those failures, all of which can be (and have
been) monitored and tracked over long periods of time, showing where the
MOST RISK lies.

Point-in-time transactions aren't where that risk is today.  In fact, I'd
place a VERY large monetary bet that far more data has been lost to a lack
of employee training and careful procedures than any point-in-time backup or
filesystem has ever lost at any employer I've ever worked for.  Wouldn't you?

We admins like to discuss this stuff at this low level, but we overlook the
"plank in our own eye": all of us have made data-loss mistakes.  Well,
anyone who's worked in the sysadmin field for any length of time, and is
HONEST, has...

How often do companies take NO steps to avoid such mistakes in the future,
other than saying, "Don't do that again"?  The machines can often be
programmed to help us avoid human error, but often aren't until the loss
becomes too great for the company to bear and stay in existence.

So... from a purely logical standpoint, you're right.  Time is hard to
wrap our heads around.  From a practical standpoint, though, getting the
time-based errors out of our systems really isn't that difficult if we
focus on them and document their limitations.

Nate 

-----Original Message-----


I continue to believe that there is an issue here, but not one
that will likely be resolved by continuing this line of discussion. 

Cheers, 

-- 
Paul E Condon           
pecondon at mesanetworks.net



