[lug] [Slightly OT] File Management?

Paul E Condon pecondon at mesanetworks.net
Tue Mar 24 12:49:13 MDT 2009


On 2009-03-24_08:25:22, Rob Nagler wrote:
> Lee Woodworth writes:
> > It sounds like you have looked at rsync and decided it isn't useful
> 
> We use rsync to mirror the data.
> 
> > When you say "SATA backup solutions" do you mean appliance type
> > of units or something like Lefthand Networks iScsi units?
> 
> What you have to look at with all these solutions is "end to end"
> performance for large, varied data sets.  What I've found is that if
> everything is according to some benchmark, everything works great.
> When I use it for my particular application (large & varied), it
> doesn't.  The controllers can only get the data to the right place on
> the disks so fast so I haven't found that controllers are the issue.
> 
> Consider our cp --link solution for the "woops problem".  We have to
> delete the link trees older than say, a week.  This requires a very
> large "rm -rf".  It takes hours.
> 
> > You mentioned system board and controller failures for your own built
> > systems. Were there issues with the SATA disk drives?
> 
> I didn't build the systems.  I bought them from integrators (jncs,
> pcpitstop, and promise).  The SATA disks are great.
> 
> > Copied about 120GB w/ average file size of 50M.
> 
> That's nice, but that's not reality for us.  We have *lots* of small
> files, which change frequently.
> 
> > Write rates were 25MB/s to each disk in the mirror (no caching).
> 
> Was this a clean file system?  If so, I wouldn't be surprised.
> 
> > drives (no NCQ), so I would expect with a new MB with PCI express, decent
> > SATA controller cards (non-raid), and higher performance drives (with real
> > NCQ) the transfer rates would be higher even across multiple mirror sets.
> 
> Again, the CPU or the bus isn't the problem.  It's the physical disk
> positioning that's the problem.  When you employ caching, the
> controller can optimize the write queue.  Without caching, you have to
> wait for the disks to spin around to the right place.  All controllers
> we have tried took longer than 24 hours to do what our backups
> require, unless we turned on caching.  With caching, it still is
> slower than I'd like, but the initial rsync "snapshot" is at network
> speeds so it's good enough.
> 
> Rob
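
[Editorial aside: Rob's cp --link scheme, as I understand it, looks roughly
like the following minimal sketch. The paths, file contents, and the
seven-day retention policy are invented for illustration, not Rob's actual
setup.]

```shell
# Minimal sketch of an rsync + "cp -al" snapshot rotation.
# All paths here are invented for illustration.
BACKUP=/tmp/backup-demo
mkdir -p "$BACKUP/current"
echo "hello" > "$BACKUP/current/file.txt"
# (in real use, something like: rsync -a /data/ "$BACKUP/current/"
#  refreshes the mirror first)

# Cheap "point in time" copy: every file is hard-linked, not duplicated,
# so the snapshot costs almost no disk space.
cp -al "$BACKUP/current" "$BACKUP/snap-demo"

# Expiring old link trees is the expensive part Rob mentions:
# find "$BACKUP" -maxdepth 1 -name 'snap-*' -mtime +7 -exec rm -rf {} +
```

Because the snapshot shares inodes with the live tree, deleting it later
means unlinking every entry, which is why the rm -rf takes hours on a
tree with lots of small files.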

Rob, I'm not an expert, just a guy who likes to think about other
people's problems. It occurs to me that reordering the writes in the
cache to reduce seeks must, to some extent, require logic that
depends on the nature of the work flow in a particular situation.
Actually, that's the nature of your argument, isn't it? And there is
no logic that can be right for all customers of a disk supplier,
so -- could there be a front end to a disk farm that does the
caching in RAM, under the control of a locally developed algorithm,
and in an environment where the admins can monitor the size of the
queue of waiting write requests? In your case, it seems that hiding
this problem inside the disk box is not a good idea. For you, the
needed cache size may be too large for manufacturers to be able to
offer the product at a sustainable price point.
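
[Editorial aside: the kind of reordering Paul has in mind amounts to an
elevator sort over the pending write queue. A toy sketch, with invented
file names and offsets, and echo standing in for the actual writes:]

```shell
# Toy sketch of elevator-style write reordering: sort pending writes by
# disk offset so the head sweeps in one direction instead of seeking
# back and forth. Queue entries and offsets are invented.
queue=/tmp/write-queue.txt
printf '%s\n' '900000 c.dat' '100000 a.dat' '500000 b.dat' > "$queue"

# The monitoring hook Paul asks for: queue depth is just a line count.
echo "pending writes: $(wc -l < "$queue")"

# Issue the writes in ascending-offset order (here we only echo them;
# a real front end would actually write at each offset).
sort -n "$queue" | while read -r offset file; do
    echo "write $file at offset $offset"
done
```

Keeping the queue in a box the admins control, rather than inside the
drive firmware, is exactly what makes its depth observable over time.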

Mostly, having the reordering done in a box that can be monitored to
any level of detail desired can give the admins important advance
warning of increasing hardware needs. Do these fancy hard drives
really give adequate diagnostics of their internal operation when
they are operating outside the laboratory? (rhetorical question)

PS I have no problem with your point that physical limitations of
hardware cannot be, always and everywhere, overcome by smart generic
software. It is analogous to the crazy thinking that led financial
people to believe that a system by which each person avoids risk will
continue to work when that system is used by everybody.

On a slightly different level, I am skeptical of the very idea of
"point in time snapshots". At a level of detail that we can think
about, and that may actually be coming into reality, there is no
single definition of time that is valid everywhere. This is a well
known result of the Special Theory of Relativity. More significantly,
because of transmission delays, it is not possible to know of a
commit of a transaction until some time after it has actually
occurred. This can lead the system to let other transactions commit
that would not have been allowed had the first commit been known
about promptly. This is not a mistake made because of an inadequate
implementation, but because of an inadequate understanding of the
nature of time. Talk of a "point in time" is evidence of an
inadequate understanding of the nature of time.

-- 
Paul E Condon           
pecondon at mesanetworks.net
