[lug] File replace thought

Tue Sep 18 18:28:20 MDT 2007

Ben Burdette wrote:
> Lori Reed wrote:
>> Sean Reifschneider wrote:
>>
>>> On Mon, Sep 17, 2007 at 08:18:39PM -0600, Andrew Diederich wrote:
>>>> What I realized is that what I /really/ want to know, is "are the
>>>> files different?"  Does KDE or Gnome do something similar, or with an
>>>> embedded diff call?  The next time someone on the list is designing a
>>>
>>> Have you submitted this feature request to KDE or one of the other
>>> desktop environments or file-managers?  If not, you should.  I don't
>>> use drag and drop file management, but I think this is a really good
>>> idea.
>>
>> I really don't think you'd want this, since it more than doubles the 
>> execution time to copy a file. And if you happen to have two files 
>> with the same name, size, and time stamp, but are in fact different, 
>> the software will still need user input as to what to do with the file 
>> being copied.
>>
>> I considered this problem over 20 years ago and concluded that same 
>> name, same date, and same size equals same file. I haven't, to my 
>> knowledge, been stung yet.
>>
>> And most of the time, thanks to dumb-assed software like ftp, browser 
>> downloads, and the cp option "-p" being not the default, same name and 
>> same size almost always equals same file.
>>
> I think this is true - size and date and name == same file.  That said, 
> I think the diff would be handy when you have the same name but 
> different mod dates or sizes.  Then you get the replace prompt, and on 
> that dialog would be a 'diff' button.  Press that button to see what the 
> differences are in the files, then you can make an informed decision on 
> whether to replace.  This also unobtrusively makes the noob user aware 
> of the existence of a diff utility.
> So if you drop "Latest Dissertation Revision.odt" into a folder with 
> another "Latest Dissertation Revision.odt", you can click diff and 
> hopefully see that the one that is there has the footnotes, while the 
> one you are copying has extra pages, and they need to be merged.
> 

The difficulty with doing a diff here is that there are three-ish
classes of files: binary files (images, Word docs, etc.), text/XML files
dressed up like binary files (e.g. .odt, which is a zipped archive of
XML/text), and plain text files.  I'm going to assume that a diff is
nearly meaningless in the binary case (since a size difference is
sufficient to show that the files differ).  In the case of the text that
looks like binary files, you have to have a diff engine that is smart
enough to unzip the file and then do a file-format-aware comparison, so
that the diff is meaningful (and not a 300-character string of XML).  In
the case of the text files, I assume that anyone working with straight
text is doing so deliberately and will already be aware of diff.
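
For what it's worth, sorting a file into those three classes could be
sketched roughly like this in Python.  The function name classify() and
the labels are made up for illustration, and both tests are assumptions
on my part: "zip archive containing content.xml" stands in for
ODF-style documents, and "NUL byte in the first block" stands in for
binary.  Neither is a complete detector.

```python
import zipfile


def classify(path):
    """Return 'zipped-text', 'binary', or 'text' (hypothetical labels)."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            # Assumption: an archive with a content.xml member is an
            # ODF-style document (.odt and friends).
            if "content.xml" in zf.namelist():
                return "zipped-text"
        # Any other zip archive is treated as opaque binary here.
        return "binary"

    with open(path, "rb") as fh:
        chunk = fh.read(8192)

    # Heuristic only: a NUL byte in the first block means "binary".
    if b"\0" in chunk:
        return "binary"
    return "text"
```

A file manager would only offer the 'diff' button for "text" and
"zipped-text", and fall back to the plain replace prompt for "binary".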

Writing the format-aware diff engine would be a major undertaking, but
doable.  Would it be worth the effort?  For some yes, for others no.
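
To give a feel for the easy 5% of that engine, here is a very rough
sketch for .odt only.  It assumes the document body lives in
content.xml inside the zip and that paragraphs and headings are text:p
and text:h elements in the ODF text namespace.  The helper names
odt_paragraphs() and odt_diff() are mine, not from any existing tool.

```python
import difflib
import zipfile
import xml.etree.ElementTree as ET

# ODF "text" namespace, used for paragraph and heading elements.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
PARA_TAGS = ("{%s}p" % TEXT_NS, "{%s}h" % TEXT_NS)


def odt_paragraphs(path):
    """Return the visible text of each paragraph/heading in an .odt file."""
    with zipfile.ZipFile(path) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    # Crude: nested paragraphs (e.g. inside footnotes) show up twice,
    # once in the parent's text and once on their own.
    return ["".join(elem.itertext())
            for elem in root.iter() if elem.tag in PARA_TAGS]


def odt_diff(old_path, new_path):
    """Unified diff of the two documents, one paragraph per line."""
    return list(difflib.unified_diff(
        odt_paragraphs(old_path), odt_paragraphs(new_path),
        fromfile=old_path, tofile=new_path, lineterm=""))
```

Calling odt_diff("old.odt", "new.odt") (file names are placeholders)
gives a paragraph-level diff instead of a wall of XML.  Everything this
skips (styles, tables, images, tracked changes, a usable merge UI) is
where the "major undertaking" part comes in.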

The one counter-example to my assumption about folks working with
straight text is people using things like Kile to edit TeX/LaTeX markup.

Hugh