Dec/071
Howto replace duplicate files with hard links.
Like most people, I have multiple backups of the same files, stored in an ad-hoc structure. I went hunting for a good utility to remove duplicates and replace them with hard links.
It surprised me that there is a tool for doing this on NTFS volumes under Windows. Update: and another free one!
I found a Perl script called trimtrees.pl. You can find it on CPAN; it describes itself as follows:
Traverse all directories named on the command line, compute MD5
checksums and find files with identical MD5. IF they are equal, do a
real comparison if they are really equal, replace the second of two
files with a hard link to the first one. Special care is taken to cope with C
error conditions.
The inode that is overbooked in such a way, is taken out of the pool
and replaced with the another one such that the minimum of files
needed is kept on disk.
The C<--maxlinks> option can be used to reduce the linkcount on all
files within a tree, thus preparing the tree for a subsequent call to
C. This operation can be thought of the reverse of the normal
trimtrees operation (--maxlinks=1 produces a tree without hard links).
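To make the description above concrete, here is a minimal Python sketch of the same idea: hash every file, confirm matches byte for byte, then swap each duplicate for a hard link. It is not the trimtrees.pl code. The names file_md5 and link_duplicates and the ".linktmp" suffix are made up for illustration, and the sketch assumes all directories sit on one filesystem and ignores the link-count limit that the real script takes care of.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def link_duplicates(roots):
    """Replace duplicate files under the given roots with hard links.

    Returns the number of files that were replaced.
    """
    by_hash = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                key = (os.path.getsize(path), file_md5(path))
                by_hash[key].append(path)

    replaced = 0
    for paths in by_hash.values():
        keeper = paths[0]
        for dup in paths[1:]:
            if os.path.samefile(keeper, dup):
                continue  # already the same inode
            if not filecmp.cmp(keeper, dup, shallow=False):
                continue  # same checksum but different bytes
            # Link under a temporary name first, then rename over the
            # duplicate, so the path never goes missing in between.
            tmp = dup + ".linktmp"  # hypothetical temporary suffix
            os.link(keeper, tmp)
            os.replace(tmp, dup)
            replaced += 1
    return replaced
```

Calling link_duplicates(["/backups/a", "/backups/b"]) (example paths) would walk both trees and return how many files were turned into links. Try it on a copy first, since hard-linked files share their contents: editing one edits them all.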
2:27 pm on February 14th, 2008
That really helped
Thanks. I use this on my podcast directories now, because my original podcatcher is not intelligent enough to do so.
The Starseeker