Extended discussion on databases for backup indexes: WAS: Re: [lug] Bacula
Alan Robertson
alanr at unix.sh
Sun Sep 18 23:34:35 MDT 2005
George Sexton wrote:
>>-----Original Message-----
>>From: lug-bounces at lug.boulder.co.us
>>[mailto:lug-bounces at lug.boulder.co.us] On Behalf Of Alan Robertson
>>Sent: Saturday, September 17, 2005 10:11 AM
>>To: Boulder (Colorado) Linux Users Group -- General Mailing List
>>Subject: Re: [lug] Bacula
>
> Clearly you've put a lot of thought into it.
Yeah. I spent a while using a bad database-based package, wound up
writing my own to replace it, and a few years later spent a while
evaluating packages to replace the one I wrote. A couple of times in
my career I've implemented from scratch, or worked on, b-trees for
databases.
> If I were using a database that could only do 100 inserts per second, I
> would seriously think about changing professions. Your estimate is off by at
> least one order of magnitude, and probably two. Someone else has pointed
> this out.
>
> I think part of the problem is that you're putting a lot of unnecessary
> operations in place. For example:
>
>>So... If nothing new is being backed up (best case), then
>> For the recycle:
>> walk sequentially through 10 million tuples in FilesTape
>> do 10 million searches by attributes on BackedUpFiles
>> do 10 million in-place updates in BackedUpFiles (ref count)
>> plus 10 million deletes in FilesTape
>
> The first two steps are unnecessary. The reference count field is a crutch
> that denormalizes the table. After the delete, purging files that no longer
> exist becomes:
>
> Delete from backed up files where file_id not in (select distinct file_id
> from filetape)
But this is not exactly a cheap operation on a database with 20-60
million tuples run against another of 60 million tuples (it's
basically a join). Either you keep secondary indices on the file id
(bulky, and somewhat expensive on updates), or you sort on the file id
- more expensive, but less bulky.
This is analogous to what I called earlier "doing an audit". It will
take a while to run. In particular, I suspect it will be slower than
the earlier proposal with a reference count.
How fast the join runs depends a lot on how much memory you can give
your database, whether you maintain the secondary indices, and how
smart your database's join code is.
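Just to make the trade-off concrete, here's a rough sketch of that
purge against a toy SQLite database. The table and column names below
are only my guesses from this thread - they aren't anybody's real
schema - and a real index would obviously carry a lot more columns.

    import sqlite3

    conn = sqlite3.connect("backup-index.db")
    cur = conn.cursor()
    cur.executescript("""
        CREATE TABLE IF NOT EXISTS BackedUpFiles
            (file_id INTEGER PRIMARY KEY, pathname TEXT);
        CREATE TABLE IF NOT EXISTS FilesTape
            (tape_id INTEGER, file_id INTEGER);
        -- Without a secondary index like this one, the NOT IN below
        -- degenerates into a sort or repeated scans of FilesTape:
        CREATE INDEX IF NOT EXISTS filestape_fileid
            ON FilesTape(file_id);
    """)

    # George's purge: drop index entries no tape refers to any more.
    cur.execute("""
        DELETE FROM BackedUpFiles
         WHERE file_id NOT IN (SELECT DISTINCT file_id FROM FilesTape)
    """)
    conn.commit()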
> That leaves 10 million deletes. Difficult, but not impossible. There are
> ways of scaling around the transaction limit that I won't go into, but they
> are not that hard.
I didn't mean to imply it's impossible - just that it's a heck of a
lot of work. If your file sizes are small (like software development
source files), then writing the database can actually be slower than
writing the data to tape - because you can have 20 tape drives writing
at once, but only one database for the indexes. Backing up the data
itself scales linearly, up to the limit of the raw bandwidth of your
disks and networks.
> I'm pretty confident these steps can be done in less than the 14 hours or so
> you quoted.
I'm sure you have a lot more experience with modern databases than I do.
It wouldn't be hard to make up a sample database to benchmark this.
Keep in mind that it is unlikely anyone would buy a copy of Oracle or
DB2 for this task - it's just too expensive, and too complex to set up
and tune for a backup solution.
But writing the index information into flat files _for this case_ is
at least an order of magnitude faster and more reliable. Your average
backup operator isn't exactly a DBA, and when things go wrong at 2 AM,
they won't have any idea how to repair database problems. [Backups
_only_ go bad at 2-4 AM ;-)].
This would probably be why Veritas backup doesn't use a database (or
didn't the last time I looked at it).
And there's a simple trick which makes them roughly as fast to search
as a database for filenames - and faster for non-key retrievals
(non-key meaning not on a key in the database). [Veritas didn't use
to do this - but it seems obvious to me].
If you're using a find-like operation to traverse your directory
structure for backing up, do this: sort each directory by file name as
you see it - then your find output comes out in sorted order by
pathname - and back things up in that order, writing the index file as
you go. Then - just like magic - your index file is also in sorted
order by pathname, in less than n ln(n) time (where n is the total
number of files backed up). All you have to do then is a simple
binary search to find things quickly by "primary key" (i.e.,
pathname).
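Here's a rough Python sketch of the trick, just to make it concrete.
The index format is deliberately dumbed down to one pathname per line,
and the function names are made up:

    import os
    from bisect import bisect_left

    def walk_sorted(top):
        # Depth-first traversal, sorting each directory's entries by
        # name as we go, so pathnames come out in sorted order.
        for name in sorted(os.listdir(top)):
            path = os.path.join(top, name)
            yield path
            if os.path.isdir(path) and not os.path.islink(path):
                for sub in walk_sorted(path):
                    yield sub

    def write_index(top, index_file):
        # Back things up in traversal order, writing one index line
        # per file; the flat index ends up sorted by pathname as a
        # side effect - no separate sort pass needed.
        with open(index_file, "w") as f:
            for path in walk_sorted(top):
                f.write(path + "\n")

    def lookup(index_file, pathname):
        # Binary search the sorted index by "primary key" (pathname).
        # Compare component-by-component so the comparison matches the
        # order the traversal produced.  A real implementation would
        # seek around in the file rather than slurp it into memory.
        lines = open(index_file).read().splitlines()
        keys = [tuple(l.split(os.sep)) for l in lines]
        want = tuple(pathname.split(os.sep))
        i = bisect_left(keys, want)
        if i < len(keys) and keys[i] == want:
            return lines[i]
        return None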
With flat files, there is virtually no possibility of database
corruption, and they require no serious planning, no fast disks,
negligible CPU and memory, and no sophisticated technology to make
them fast enough - nothing but a modest amount of code.
All you have to deal with now is the "essentials" of backups - reading
files and writing tapes.
To be fair - the database approach has advantages: you aren't
constrained in what order you back things up. You can have many
engines reading files simultaneously, queueing for the tape drive as
they get ready, and writing files out in whatever order they become
ready.
On the other hand, you can always sort the tape content (index) file at
the end - another relatively low-tech solution. This will be faster
than the sort for the join mentioned above.
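A sketch of that, in the same toy one-pathname-per-line index format
as above (an index too big for memory would go through sort(1) or an
external merge sort instead):

    import os

    def sort_index(unsorted_path, sorted_path):
        # Read the index written in backup-completion order, sort it
        # by pathname components, and rewrite it so the binary-search
        # lookup above still works.
        with open(unsorted_path) as f:
            lines = f.read().splitlines()
        lines.sort(key=lambda p: tuple(p.split(os.sep)))
        with open(sorted_path, "w") as f:
            for line in lines:
                f.write(line + "\n")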
I guess the point of the rant is that conventional relational databases
are really good at the things they're good at - but that they're not the
universal panaceas they're sometimes thought to be.
They're well-designed for a certain class of problems. And for some
problems which you might think they'd be really good at (like backup
indexes), they're probably overkill, and not exactly stellar in
performance either.
performance either. They're basically a screwdriver when you needed a
hammer. You can drive a nail in with a screwdriver, but it's kind of
inefficient, and a lot more (computational) work than if you had a hammer.
--
Alan Robertson <alanr at unix.sh>
"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions." - William
Wilberforce