[lug] Bacula

George Sexton gsexton at mhsoftware.com
Sun Sep 18 08:53:04 MDT 2005


> -----Original Message-----
> From: lug-bounces at lug.boulder.co.us 
> [mailto:lug-bounces at lug.boulder.co.us] On Behalf Of Alan Robertson
> Sent: Saturday, September 17, 2005 10:11 AM
> To: Boulder (Colorado) Linux Users Group -- General Mailing List
> Subject: Re: [lug] Bacula

Clearly you've put a lot of thought into it.

If I were using a database that could only do 100 inserts per second, I
would seriously think about changing professions. Your estimate is off by at
least one order of magnitude, and probably two. Someone else has already
pointed this out.

I think part of the problem is that you're putting a lot of unnecessary
operations in place. For example:

> So... If nothing new is being backed up (best case), then
>    For the recycle:
>        walk sequentially through 10 million tuples in FilesTape
>        do 10 million searches by attributes on BackedUpFiles
>        do 10 million in-place updates in BackedUpFiles (ref count)
>        plus 10 million deletes in FilesTape

The first two steps are unnecessary. The reference count field is a crutch
that denormalizes the table. After deleting the recycled tape's rows from
FilesTape, purging files that are no longer on any tape becomes:

DELETE FROM BackedUpFiles
WHERE File_ID NOT IN (SELECT DISTINCT File_ID FROM FilesTape);

That leaves 10 million deletes. Difficult, but not impossible. There are
ways of scaling around the giant-transaction problem, and they are not that
hard.
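To give the flavor of just one approach (a rough sketch in MySQL-flavored
SQL - the tape id and batch size are made-up numbers, not anything Bacula
itself does): recycle a tape by deleting its FilesTape rows in modest
batches and committing between them, repeating until nothing is left:

-- Repeat until the statement reports zero rows affected; each pass is a
-- small, ordinary transaction instead of one 10-million-row monster.
DELETE FROM FilesTape WHERE Tape_ID = 42 LIMIT 100000;
COMMIT;

Each batch commits quickly, so the database never has to hold millions of
row changes in a single transaction.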


I'm pretty confident these steps can be done in less than the 14 hours or so
you quoted.

>    For backing up:
>        do 10 million search by keys - on a complex key in FilesTape
>        plus 10 million in-place updates in BackedUpFiles (ref count)
>        plus 10 million inserts in FilesTape

Doing the lookup on the file's MD5SUM cuts the number of candidates for the
complex attribute search down to very few records. Having the database check
the remaining attributes on those few collisions is pretty cheap.
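In concrete terms, it's something like this (a sketch only - the index name
and the :bind parameters are just placeholders for values the backup program
already has in hand):

-- An index on the checksum makes the candidate lookup cheap.
CREATE INDEX idx_backedupfiles_md5 ON BackedUpFiles (MD5SUM);

-- The MD5SUM predicate narrows the search to a handful of rows; comparing
-- the remaining attributes resolves any collisions.
SELECT File_ID
  FROM BackedUpFiles
 WHERE MD5SUM     = :md5
   AND FileName   = :name
   AND Size       = :size
   AND Owner      = :owner
   AND Attributes = :attrs
   AND Date       = :mtime;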

I've discussed dumping the reference count already.

You really can't convince me that the insert of one record into FilesTape is
going to take longer than it takes to stream the actual file to a backup
device.

The key search in BackedUpFiles and the insert into FilesTape can also be
done in a separate thread. Since neither is likely to take longer than
streaming the actual file to the backup device, that thread would be idle
most of the time.
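Put concretely, the per-file catalog work in that thread is roughly the
lookup above plus two single-row inserts (again a sketch against the same
schema, with :parameters standing in for values the thread already knows):

-- Only when the lookup found no existing row for this exact file:
INSERT INTO BackedUpFiles
       (File_ID, FileName, Size, Owner, Attributes, Date, MD5SUM)
VALUES (:file_id, :name, :size, :owner, :attrs, :mtime, :md5);

-- Always: record which tape this copy of the file went to.
INSERT INTO FilesTape (File_ID, Tape_ID)
VALUES (:file_id, :tape_id);

That is trivial work next to the time spent pushing the file's bytes out to
tape.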



George Sexton
MH Software, Inc.
http://www.mhsoftware.com/
Voice: 303 438 9585
  

> -----Original Message-----
> From: lug-bounces at lug.boulder.co.us 
> [mailto:lug-bounces at lug.boulder.co.us] On Behalf Of Alan Robertson
> Sent: Saturday, September 17, 2005 10:11 AM
> To: Boulder (Colorado) Linux Users Group -- General Mailing List
> Subject: Re: [lug] Bacula
> 
> George Sexton wrote:
> >>-----Original Message-----
> >>From: lug-bounces at lug.boulder.co.us 
> >>[mailto:lug-bounces at lug.boulder.co.us] On Behalf Of Alan Robertson
> >>Sent: Wednesday, September 14, 2005 10:50 PM
> >>To: Boulder (Colorado) Linux Users Group -- General Mailing List
> >>Subject: Re: [lug] Bacula
> >>
> >>
> >>A simple example:
> >>	If you want to back up 10 million small files, that means
> >>	probably 10 million inserts to a database - which is a HUGE
> >>	amount of work - and incredibly slow when compared to writing
> >>	10 million records to a flat file.
> >>	And when you recycle a backup, that means 10 million deletions.
> >>
> > 
> > Actually, for the single backup case, your argument is 
> sound. For the second
> > backup, it's wrong. Consider a table structure:
> > 
> > BackedUpFiles
> > -------------
> > File_ID
> > FileName
> > Size
> > Owner
> > Attributes
> > Date
> > MD5SUM
> 
> AND very importantly you need a reference count - or you can never 
> delete anything from this relation.  And, this relation has to be 
> indexed by anything you want to search on.  This includes AT LEAST 
> the file id, and the combination of (FileName, Size, Owner, 
> Attributes, Date and MD5sum).  Or you could add another field, which 
> you might call attribute_sum, which is the md5 sum of all the fields 
> except the reference count.
> 
> > Tape
> > ------------
> > Tape_ID
> > InitialUseDate
> > UseCount
> > Etc.
> > 
> > FilesTape
> > ---------
> > File_ID
> > Tape_ID
> > 
> > It becomes pretty trivial to enumerate the tapes a specific file is
> > on. For the 2nd and subsequent backups, the space usage is vastly
> > more efficient.
> > 
> > I'll give you that removing 10 million entries from FilesTape is a
> > job, but the advantage of being able to quickly enumerate the tapes
> > a specific file is on vastly outweighs that issue.
> 
> Plus the 10 million inserts.  If each averages a lightning fast 10ms 
> (which it won't), then updating the indexes will take about 28 hours - 
> 14 of which typically has to happen before the backup can start.  This 
> may be a bit problematic for your average set of fulls over the 
> weekend.  And, this has been my personal experience in exactly such 
> circumstances.  Writing the corresponding flat files happens faster 
> than the tapes can be written.  Deleting a 30MB file takes a fraction 
> of a second.
> 
> And, on the average - you never refer to these records again.
> 
> > From a space perspective, for the second backup a database of file
> > names would use something like 30MB additional to store the
> > contents, while the flat file would probably use something like:
> > 
> > 10^6*(80(avg file name length)+6(owner and attributes)+8(file
> > size)+8(MD5SUM)+8(file date)+10(delimiters and CRLF))
> 
> 
> I've never seen a schema like yours used in practice.  You have to 
> also keep a reference count for the BackedUpFiles database, and you 
> have to hope you keep it correct - in the case of system crashes etc.  
> So, every so often, you have to walk through the whole set and audit 
> them to see that they're consistent every time you crash during a 
> backup or a recycle operation.
> 
> To recycle, here's what you have to do:
>         walk through the FilesTape relation by Tape key and delete 10
>            million tuples
>         Do 10 million lookups on File_ID in BackedUpFiles, and
>           decrement 10 million reference counts (fortunately this
>           doesn't affect the indexes)
>         Delete any tuples in BackedUpFiles whose reference count
>           is now zero (updating >= two indexes).
> 
> To back up, here's what you have to do:
>         Before each file is backed up, you have to look up its
>          (md5sum, FileName, Size, Owner, Attributes, Date)
>         to see if you already have a file id for that file
>         If you don't, then get a new file id (increment a counter,
>           probably, and sync that counter out to disk in case of crash)
>         If you already have one, you have to update the reference
>           count in the BackedUpFiles relation
>         Insert a record into the FilesTape relation
> 
> So... If nothing new is being backed up (best case), then
>    For the recycle:
>        walk sequentially through 10 million tuples in FilesTape
>        do 10 million searches by attributes on BackedUpFiles
>        do 10 million in-place updates in BackedUpFiles (ref count)
>        plus 10 million deletes in FilesTape
>    For backing up:
>        do 10 million search by keys - on a complex key in FilesTape
>        plus 10 million in-place updates in BackedUpFiles (ref count)
>        plus 10 million inserts in FilesTape
> 
> If everything being backed up has at least one attribute changed every 
> time you back up (worst case), then you will do this:
>     For the recycle:
>        Walk sequentially through 10 million tuples in FilesTape
>        do 10 million searches by complex key on BackedUpFiles
>        do 10 million deletes in BackedUpFiles
>        do 10 million deletes in FilesTape
>     For backing up:
>        do 10 million search by keys - on complex key in FilesTape
>        plus 10 million inserts in BackedUpFiles
>        plus 10 million inserts in FilesTape
> 
> If you keep 6 weeks of fulls online, then FilesTape will have 60 
> million records in it (best or worst case) - making massive inserts 
> and deletes into it really expensive.
> 
> In the best case from above, you have 10 million tuples in 
> BackedUpFiles.  In the worst case, you have a relation of 60 
> million tuples.
> 
> 
> Best case:
>        read 10 million tuples sequentially by key
>        do 10 million search by attributes
>        do 10 million replace operations
>        10 million deletes
> 
>        10 million searches by key
>        10 million more record updates
>        10 million inserts
> 
> If you make the recycle a transaction and writing a single tape a 
> transaction, then you have some extremely large transactions which 
> will take a very long time to commit.
> 
> Let's say the sequential reads take .1 ms each, the searches by key 
> take 3 ms each, the replaces take 3 ms each, inserts take 4 ms, and 
> deletes take 5 ms; then here's the time:
> 
>        read 10 million tuples sequentially by key:	 1000 s.
>        do 10 million search by attributes		30000 s.
>        do 10 million replace operations			30000 s.
>        10 million deletes				50000 s.
> 
>        10 million searches by key			30000 s.
>        10 million more replace operations		30000 s.
>        10 million inserts				40000 s.
> 
> This is about 58 hours - for the best case - where nothing has really 
> changed.
> 
> 
>        read 10 million tuples sequentially by key:	 1000 s.
>        do 10 million search by attributes		30000 s.
>        do 10 million deletes				50000 s.
>        10 million more deletes				50000 s.
> 
>     For backing up:
>        do 10 million search by keys			30000 s.
>        plus 10 million inserts in BackedUpFiles		40000 s.
>        plus 10 million inserts in FilesTape		40000 s.
> 
> This is nearly 70 hours.
> 
> If you make the recycle one transaction and the backup one transaction, 
> then most of the delay for committing the backup transaction comes 
> after the backup is completed.  If you break it up into one transaction 
> per tape, it's a bit better - but still this database work will likely 
> be going on for several days into the week - so the files backed up 
> won't be available until Tuesday or so...
> 
> Notice that this doesn't scale very well.  It's primarily 
> bound by disk 
> write speed - and this is very hard to speed up in reality.
> 
> You've made your backup problem into an enterprise-class database 
> problem - just for the indexes - which are rarely used.  Now, if you 
> talk about the backups themselves - they are much easier to scale.
> 
> 
> -- 
>      Alan Robertson <alanr at unix.sh>
> 
> "Openness is the foundation and preservative of friendship...  Let me 
> claim from you at all times your undisguised opinions." - William 
> Wilberforce
> _______________________________________________
> Web Page:  http://lug.boulder.co.us
> Mailing List: http://lists.lug.boulder.co.us/mailman/listinfo/lug
> Join us on IRC: lug.boulder.co.us port=6667 channel=#colug
> 
> 



