[lug] File replace thought

Andrew Diederich andrewdied at gmail.com
Tue Sep 18 19:21:52 MDT 2007


On 9/18/07, Hugh Brown <hugh at math.byu.edu> wrote:
> The difficulty with doing a diff here is that there are three-ish
> classes of files: binary files (images, word docs, etc), text/xml files
> dressed up like binary files (e.g. odt that is a zipped archive of
> xml/text) and text files.  I'm going to assume that diff is nearly
> meaningless in the binary case (since size difference is sufficient).
> In the case of the text that looks like binary files, you have to have a
> diff engine that is smart enough to unzip and then do a file format
> aware comparison so that the diff is meaningful (and not a 300 character
> string of xml).  In the case of the text files, I assume that anyone
> working with straight text has had to deliberately do so and will be
> aware of diff.

<snip>

What I was thinking of wouldn't show content -- it'd just show whether
the files really were different.  The case I ran into was where I'd
copied a file multiple times off a thumb drive, so the file times were
different but the size was exactly the same.  It's also possible for
two text files to be the same size but have different content and
different time stamps, so matching sizes alone can't settle it.

I suppose there could be a decision tree in the file replace logic.
If the file size and time are the same, run a diff.  If the file size
is the same and the times are different, run a diff.  If the file size
is different, don't run a diff, whatever the times are: the files
can't be identical.
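
Roughly, that decision tree might look like the Python sketch below.
The function name files_differ is made up for illustration; os.stat
and filecmp.cmp are from the standard library.  It only reports
whether the two files differ and never shows content.  It assumes both
paths are regular files.

```python
import filecmp
import os


def files_differ(path_a, path_b):
    """Return True if the two files differ in content, False if identical.

    Hypothetical helper: a sketch of the decision tree, not the logic of
    any real file manager.
    """
    size_a = os.stat(path_a).st_size
    size_b = os.stat(path_b).st_size

    # Different sizes: the contents cannot match, so skip the comparison.
    if size_a != size_b:
        return True

    # Same size: the timestamps prove nothing either way, so compare the
    # bytes. shallow=False makes filecmp read the contents instead of
    # trusting the os.stat() signature (type, size, mtime).
    return not filecmp.cmp(path_a, path_b, shallow=False)
```

Written this way, the file time never changes the outcome: size decides
whether a content comparison is needed at all.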

-- 
Andrew


