9 Dec 2007

How to replace duplicate files with hard links

Like most people, I have multiple backups of the same files, stored in an ad-hoc structure. I went hunting for a good utility to find the duplicates and replace them with hard links.

It surprised me that there is a tool for doing this on NTFS volumes under Windows. Update: and another free one!

I found a Perl script called trimtrees.pl. You can find it on CPAN; it describes itself as follows:

Traverse all directories named on the command line, compute MD5
checksums and find files with identical MD5. IF they are equal, do a
real comparison if they are really equal, replace the second of two
files with a hard link to the first one.

Special care is taken to cope with C error conditions.
The inode that is overbooked in such a way, is taken out of the pool
and replaced with the another one such that the minimum of files
needed is kept on disk.

The C<--maxlinks> option can be used to reduce the linkcount on all
files within a tree, thus preparing the tree for a subsequent call to
C. This operation can be thought of the reverse of the normal
trimtrees operation (--maxlinks=1 produces a tree without hard links).
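
For anyone curious how the approach in that description works, here is a minimal Python sketch of the same idea: group files by MD5, confirm a match with a byte-for-byte comparison, then swap the later copy for a hard link to the first. This is not trimtrees.pl itself, only an illustration under some assumptions. The names `file_md5` and `link_duplicates`, the `dry_run` flag and the `.linktmp` suffix are all made up for this example. It does not handle the "too many links" case the description mentions, and it does not try to reconcile differing permissions or timestamps between the copies.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def link_duplicates(roots, dry_run=True):
    """Replace later duplicates with hard links to the first copy seen.

    Returns a list of (duplicate, original) pairs. With dry_run=True
    nothing on disk is changed.
    """
    by_digest = defaultdict(list)  # MD5 digest -> kept files with that digest
    linked = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                candidates = by_digest[file_md5(path)]
                for original in candidates:
                    if os.path.samefile(path, original):
                        break  # already the same inode, nothing to do
                    # Hard links cannot cross filesystems.
                    if os.stat(path).st_dev != os.stat(original).st_dev:
                        continue
                    # Byte-for-byte check guards against MD5 collisions.
                    if filecmp.cmp(path, original, shallow=False):
                        if not dry_run:
                            # Link under a temporary name (hypothetical
                            # suffix), then rename over the duplicate so
                            # the path never goes missing.
                            temp = path + ".linktmp"
                            os.link(original, temp)
                            os.replace(temp, path)
                        linked.append((path, original))
                        break
                else:
                    # No identical file seen so far: keep this one.
                    candidates.append(path)
    return linked
```

Calling `link_duplicates(["/path/to/backups"])` with the default `dry_run=True` only reports what would be linked, which seems the safer way to try it on real backups first. Keep in mind that once two paths share an inode, editing one in place changes the other too.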

Comments (4)
  1. starseeker
    2:27 pm on February 14th, 2008

    That really helped :) Thanks. I use this on my podcast directories now, because my original podcatcher is not intelligent enough to do so.

    The Starseeker

  2. Mathew
    4:50 pm on July 16th, 2009

    This sounds like exactly what I am looking for, but CPAN doesn’t have that link. Searching on CPAN was no joy. I did find some scripts using a search engine; I think the second one is the one you used, since the URL looks similar. Looks like CPAN is changing the dir structure. http://cpansearch.perl.org/src/ANDK/Perl-Repository-APC-2.002/eg/trimtrees.pl

  3. asmkrt
    3:04 pm on August 17th, 2009

    I found Duplicate Finder 2009 the most advanced version of duplicate file finder on the market for Windows systems…

  4. JClark
    9:40 am on September 16th, 2009

    Yet another freeware tool I found with this feature is Duplicate Cleaner (http://www.digitalvolcano.co.uk/content/duplicate-cleaner).

    P.S.
    The above post by asmkrt is probably spam. “Duplicate Finder 2009” is not freeware.
