Extended discussion on databases for backup indexes: WAS: Re: [lug] Bacula
Alan Robertson
alanr at unix.sh
Sun Sep 18 23:34:35 MDT 2005
George Sexton wrote:
>>-----Original Message-----
>>From: lug-bounces at lug.boulder.co.us
>>[mailto:lug-bounces at lug.boulder.co.us] On Behalf Of Alan Robertson
>>Sent: Saturday, September 17, 2005 10:11 AM
>>To: Boulder (Colorado) Linux Users Group -- General Mailing List
>>Subject: Re: [lug] Bacula
>
> Clearly you've put a lot of thought into it.
Yeah. I spent a while using a bad database-based package, wound up
writing my own to replace it, and a few years later spent a while
evaluating packages to replace the one I wrote. A couple of times in
my career I've implemented from scratch, or worked on, b-trees for
databases.
> If I were using a database that could only do 100 inserts per second, I
> would seriously think about changing professions. Your estimate is off by at
> least one order of magnitude, and probably two. Someone else has pointed
> this out.
>
> I think part of the problem is that you're putting a lot of unnecessary
> operations in place. For example:
>
>>So... If nothing new is being backed up (best case), then
>> For the recycle:
>> walk sequentially through 10 million tuples in FilesTape
>> do 10 million searches by attributes on BackedUpFiles
>> do 10 million in-place updates in BackedUpFiles (ref count)
>> plus 10 million deletes in FilesTape
>
> The first two steps are unnecessary. The reference count field is a crutch
> that denormalizes the table. After the delete, purging files that no longer
> exist becomes:
>
> Delete from backed up files where file_id not in (select distinct file_id
> from filetape)
But this is not exactly a cheap operation on a database with 20-60
million tuples run against another of 60 million tuples (it's
basically a join). Either you keep secondary indices on the file id
(bulky, and somewhat expensive on updates), or you sort on the file id
- more expensive, but less bulky.
This is analogous to what I called earlier "doing an audit". It will
take a while to run. In particular, I suspect it will be slower than
the earlier proposal with a reference count.
How fast the join runs depends a lot on how much memory you can give
your database, whether you maintain the secondary indices, and how
smart your database's join code is.
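Just to make the trade-off concrete, here's a rough sketch of that
purge against a toy SQLite database. The table and column names below
are only my guesses from this thread - they aren't anybody's real
schema - and a real index would obviously carry a lot more columns.

    import sqlite3

    conn = sqlite3.connect("backup-index.db")
    cur = conn.cursor()
    cur.executescript("""
        CREATE TABLE IF NOT EXISTS BackedUpFiles
            (file_id INTEGER PRIMARY KEY, pathname TEXT);
        CREATE TABLE IF NOT EXISTS FilesTape
            (tape_id INTEGER, file_id INTEGER);
        -- Without a secondary index like this one, the NOT IN below
        -- degenerates into a sort or repeated scans of FilesTape:
        CREATE INDEX IF NOT EXISTS filestape_fileid
            ON FilesTape(file_id);
    """)

    # George's purge: drop index entries no tape refers to any more.
    cur.execute("""
        DELETE FROM BackedUpFiles
         WHERE file_id NOT IN (SELECT DISTINCT file_id FROM FilesTape)
    """)
    conn.commit()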
> That leaves 10 million deletes. Difficult, but not impossible. There are
> ways of scaling around the transaction limit that I won't go into, but they
> are not that hard.
I didn't mean to imply it's impossible - just that it's a heck of a
lot of work. If your file sizes are small (like software development
source files), then writing the database can actually be slower than
writing the data to tape - because you can have 20 tape drives writing
at once, but only one database for the indexes. Backing up the data
itself scales linearly, up to the limit of the raw bandwidth of your
disks and networks.
> I'm pretty confident these steps can be done in less than the 14 hours or so
> you quoted.
I'm sure you have a lot more experience with modern databases than I do.
It wouldn't be hard to make up a sample database to benchmark this.
Keep in mind that it is unlikely anyone would buy a copy of Oracle or
DB2 for this task - it's just too expensive, and too complex to set up
and tune for a backup solution.
But writing the index information into flat files _for this case_ is
at least an order of magnitude faster and more reliable. Your average
backup operator isn't exactly a DBA, and when things go wrong at 2 AM,
they won't have any idea how to repair database problems. [Backups
_only_ go bad at 2-4 AM ;-)].
This would probably be why Veritas backup doesn't use a database (or
didn't the last time I looked at it).
And there's a simple trick which makes them roughly as fast to search
as a database for filenames - and faster for non-key retrievals
(non-key meaning not on a key in the database). [Veritas didn't use
to do this - but it seems obvious to me].
If you're using a find-like operation to traverse your directory
structure for backing up, do this: sort each directory by file name as
you see it - then your find output comes out in sorted order by
pathname - and back things up in that order, writing the index file as
you go. Then - just like magic - your index file is also in sorted
order by pathname, in less than n ln(n) time (where n is the total
number of files backed up). All you have to do then is a simple
binary search to find things quickly by "primary key" (i.e.,
pathname).
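Here's a rough Python sketch of the trick, just to make it concrete.
The index format is deliberately dumbed down to one pathname per line,
and the function names are made up:

    import os
    from bisect import bisect_left

    def walk_sorted(top):
        # Depth-first traversal, sorting each directory's entries by
        # name as we go, so pathnames come out in sorted order.
        for name in sorted(os.listdir(top)):
            path = os.path.join(top, name)
            yield path
            if os.path.isdir(path) and not os.path.islink(path):
                for sub in walk_sorted(path):
                    yield sub

    def write_index(top, index_file):
        # Back things up in traversal order, writing one index line
        # per file; the flat index ends up sorted by pathname as a
        # side effect - no separate sort pass needed.
        with open(index_file, "w") as f:
            for path in walk_sorted(top):
                f.write(path + "\n")

    def lookup(index_file, pathname):
        # Binary search the sorted index by "primary key" (pathname).
        # Compare component-by-component so the comparison matches the
        # order the traversal produced.  A real implementation would
        # seek around in the file rather than slurp it into memory.
        lines = open(index_file).read().splitlines()
        keys = [tuple(l.split(os.sep)) for l in lines]
        want = tuple(pathname.split(os.sep))
        i = bisect_left(keys, want)
        if i < len(keys) and keys[i] == want:
            return lines[i]
        return None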
With flat files, there is virtually no possibility of database
corruption, and they require no serious planning, no fast disks,
negligible CPU and memory, and no sophisticated technology to make
them fast enough - nothing but a modest amount of code.
All you have to deal with now is the "essentials" of backups - reading
files and writing tapes.
To be fair - the database approach has advantages: you aren't
constrained in what order you back things up. You can have many
engines reading files simultaneously, queueing for the tape drive as
they get ready, and writing files out in whatever order they become
ready.
On the other hand, you can always sort the tape content (index) file at
the end - another relatively low-tech solution. This will be faster
than the sort for the join mentioned above.
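A sketch of that, in the same toy one-pathname-per-line index format
as above (an index too big for memory would go through sort(1) or an
external merge sort instead):

    import os

    def sort_index(unsorted_path, sorted_path):
        # Read the index written in backup-completion order, sort it
        # by pathname components, and rewrite it so the binary-search
        # lookup above still works.
        with open(unsorted_path) as f:
            lines = f.read().splitlines()
        lines.sort(key=lambda p: tuple(p.split(os.sep)))
        with open(sorted_path, "w") as f:
            for line in lines:
                f.write(line + "\n")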
I guess the point of the rant is that conventional relational databases
are really good at the things they're good at - but that they're not the
universal panaceas they're sometimes thought to be.
They're well-designed for a certain class of problems. And for some
problems which you might think they'd be really good at (like backup
indexes), they're probably overkill, and not exactly stellar in
performance either.
performance either. They're basically a screwdriver when you needed a
hammer. You can drive a nail in with a screwdriver, but it's kind of
inefficient, and a lot more (computational) work than if you had a hammer.
--
Alan Robertson <alanr at unix.sh>
"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions." - William
Wilberforce