Dec/07
Howto replace duplicate files with hard links.
Like most people, I have multiple backups of the same files, stored in an ad-hoc structure. I went hunting for a good utility to remove duplicates and replace them with hard links.
It surprised me that there is a tool for doing this on NTFS volumes under Windows. Update: and another free one!
I found a Perl script called trimtrees.pl. You can find it on CPAN; it describes itself as such:
Traverse all directories named on the command line, compute MD5
checksums and find files with identical MD5. IF they are equal, do a
real comparison if they are really equal, replace the second of two
files with a hard link to the first one.

Special care is taken to cope with C
error conditions.
The inode that is overbooked in such a way, is taken out of the pool
and replaced with the another one such that the minimum of files
needed is kept on disk.

The C<--maxlinks> option can be used to reduce the linkcount on all
files within a tree, thus preparing the tree for a subsequent call to
C. This operation can be thought of the reverse of the normal
trimtrees operation (--maxlinks=1 produces a tree without hard links).
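
To make the description above a bit more concrete, here is a minimal Python sketch of the same general idea: group files by MD5, confirm a match byte for byte, then swap the second file for a hard link to the first. This is not the trimtrees.pl code, only an illustration under a few assumptions of my own. The names `md5sum` and `link_duplicates` and the `.linktmp` suffix are made up for this example, and the sketch makes no attempt at the special error handling or the link-count reduction that the script's description mentions.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at path, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def link_duplicates(roots):
    """Replace duplicate regular files under roots with hard links.

    Returns the number of files that were replaced by a link.
    """
    by_digest = defaultdict(list)  # MD5 digest -> distinct files kept so far
    replaced = 0
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                candidates = by_digest[md5sum(path)]
                for keeper in candidates:
                    if os.path.samefile(keeper, path):
                        break  # already linked to the same inode
                    if os.stat(keeper).st_dev != os.stat(path).st_dev:
                        continue  # hard links cannot cross filesystems
                    # An equal MD5 is only a hint; compare the real contents.
                    if filecmp.cmp(keeper, path, shallow=False):
                        tmp = path + ".linktmp"  # hypothetical temp name
                        os.link(keeper, tmp)     # create the new link first
                        os.replace(tmp, path)    # then swap it into place
                        replaced += 1
                        break
                else:
                    # No existing file matched, so this one becomes a keeper.
                    candidates.append(path)
    return replaced
```

Calling `link_duplicates(["/path/to/backups"])` (a placeholder path) returns how many files were replaced. Creating the link under a temporary name and renaming it over the duplicate means the original file is never missing, even if the run is interrupted. Try it on a copy of your data first, since hard-linked files share their contents: editing one edits them all.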
2:27 pm on February 14th, 2008
That really helped
Thanks. I use this on my podcast directories now, because my original podcatcher is not intelligent enough to do so.
The Starseeker
4:50 pm on July 16th, 2009
This sounds like exactly what I am looking for, but CPAN doesn’t have that link. Searching on CPAN was no joy. I did find some scripts using a search engine; I think the second one is the one you used, since the URL looks similar. Looks like CPAN is changing the directory structure. http://cpansearch.perl.org/src/ANDK/Perl-Repository-APC-2.002/eg/trimtrees.pl
3:04 pm on August 17th, 2009
I found Duplicate Finder 2009, the most advanced version of duplicate file finder on the market for Windows systems…
9:40 am on September 16th, 2009
Yet another freeware tool I found with this feature is Duplicate Cleaner (http://www.digitalvolcano.co.uk/content/duplicate-cleaner).
P.S.
The above post by asmkrt is probably spam. “Duplicate Finder 2009” is not freeware.