[lug] Bacula
George Sexton
gsexton at mhsoftware.com
Thu Sep 15 08:44:51 MDT 2005
> -----Original Message-----
> From: lug-bounces at lug.boulder.co.us
> [mailto:lug-bounces at lug.boulder.co.us] On Behalf Of Alan Robertson
> Sent: Wednesday, September 14, 2005 10:50 PM
> To: Boulder (Colorado) Linux Users Group -- General Mailing List
> Subject: Re: [lug] Bacula
>
>
> A simple example:
> If you want to back up 10 million small files, that means
> probably 10 million inserts to a database - which is a HUGE
> amount of work - and incredibly slow when compared to writing
> 10 million records to a flat file.
> And when you recycle a backup, that means 10 million deletions.
>
Actually, for the single backup case, your argument is sound. For the second
backup, it's wrong. Consider a table structure:

BackedUpFiles
-------------
File_ID
FileName
Size
Owner
Attributes
Date
MD5SUM

Tape
----
Tape_ID
InitialUseDate
UseCount
Etc.

FilesTape
---------
File_ID
Tape_ID

It becomes pretty trivial to enumerate the tapes a specific file is on. For
the 2nd and subsequent backups, the space usage is vastly more efficient,
since an unchanged file only needs a new row in FilesTape rather than a full
record. I'll give you that removing 10 million entries from FilesTape is a
job, but the advantage of being able to quickly enumerate the tapes a
specific file is on far outweighs that issue.
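
A minimal sketch of the structure above, using Python's built-in sqlite3 module purely for illustration. The column types, the sample rows, and the helper names tapes_for_file and recycle_tape are assumptions made up for this example, not anything Bacula itself defines; the "Etc." columns are left out.

```python
import sqlite3

# In-memory database; column types are assumed, the table and column
# names follow the structure sketched above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE BackedUpFiles (
        File_ID    INTEGER PRIMARY KEY,
        FileName   TEXT,
        Size       INTEGER,
        Owner      TEXT,
        Attributes TEXT,
        Date       TEXT,
        MD5SUM     TEXT
    );
    CREATE TABLE Tape (
        Tape_ID        INTEGER PRIMARY KEY,
        InitialUseDate TEXT,
        UseCount       INTEGER
    );
    CREATE TABLE FilesTape (
        File_ID INTEGER REFERENCES BackedUpFiles(File_ID),
        Tape_ID INTEGER REFERENCES Tape(Tape_ID),
        PRIMARY KEY (File_ID, Tape_ID)
    );
""")

# Made-up sample data: one file that has been written to two tapes.
conn.execute(
    "INSERT INTO BackedUpFiles (File_ID, FileName, Size) "
    "VALUES (1, '/etc/passwd', 1024)"
)
conn.executemany(
    "INSERT INTO Tape (Tape_ID, InitialUseDate, UseCount) VALUES (?, ?, ?)",
    [(1, "2005-09-01", 3), (2, "2005-09-08", 1)],
)
# The second backup of an unchanged file adds only a FilesTape row.
conn.executemany(
    "INSERT INTO FilesTape (File_ID, Tape_ID) VALUES (?, ?)",
    [(1, 1), (1, 2)],
)


def tapes_for_file(conn, file_name):
    """Return the Tape_IDs holding a copy of file_name, in ascending order."""
    rows = conn.execute(
        """
        SELECT ft.Tape_ID
        FROM FilesTape AS ft
        JOIN BackedUpFiles AS f ON f.File_ID = ft.File_ID
        WHERE f.FileName = ?
        ORDER BY ft.Tape_ID
        """,
        (file_name,),
    ).fetchall()
    return [tape_id for (tape_id,) in rows]


def recycle_tape(conn, tape_id):
    """Drop every FilesTape row for one tape, as when a backup is recycled."""
    conn.execute("DELETE FROM FilesTape WHERE Tape_ID = ?", (tape_id,))


print(tapes_for_file(conn, "/etc/passwd"))
```

Recycling a tape in this sketch is a single DELETE keyed on Tape_ID, so the cost is in the number of rows removed rather than in the complexity of the statement.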
From a space perspective, for the second backup the database would use
something like 30MB additional to store the contents, while the flat file
would probably use something like:

10^6*(80(avg file name length)+6(owner and attributes)+8(file
size)+8(MD5SUM)+8(file date)+10(delimiters and CRLF))

or approximately 120MB.
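
The flat-file estimate above can be checked with a few lines of Python. The field widths are the ones given in the formula; the variable names are made up for illustration, and the 10^6 file count is taken from the formula as written (the 10 million files in the quoted example would scale the total by ten).

```python
# Per-record byte counts taken from the estimate above.
FIELD_BYTES = {
    "avg file name length": 80,
    "owner and attributes": 6,
    "file size": 8,
    "MD5SUM": 8,
    "file date": 8,
    "delimiters and CRLF": 10,
}
FILE_COUNT = 10**6  # file count used in the formula

bytes_per_record = sum(FIELD_BYTES.values())
total_bytes = FILE_COUNT * bytes_per_record

print(f"{bytes_per_record} bytes per record")
print(f"{total_bytes / 10**6:.0f} MB for {FILE_COUNT:,} files")
```

The fields sum to 120 bytes per record, which is where the 120MB figure comes from.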
George Sexton
MH Software, Inc.
http://www.mhsoftware.com/
Voice: 303 438 9585