Question: Help with shell script: find and move duplicate files in a directory tree

Heya, folks.  I've got a directory tree that has 60+ GB of text documents (PDF, CHM, TXT, DOC) in a complex hierarchy.  I need a script that will:

A) Walk the entire directory tree looking for file names that are duplicates or near-duplicates.
B) Place all the dupes in a separate directory, divided into files A and B.

find plus a nice regular expression would probably do the bulk of the work, but I'm new to scripting, so I don't know whether that's the right place to start, nor (assuming it is) where to go from there. Should I walk the entire tree, build a database of every file in it, and then parse that for duplicates? Or is there another way to do the work on the fly?

Any help or suggestions would be appreciated. There aren't any other folks I can turn to for help on this one.

Thanks much --

KiT.

Answer: Help with shell script: find and move duplicate files in a directory tree

As you half-suspected yourself, you'd almost be better off getting a million monkeys to do this by hand than writing a single script ;-)

Anyway, here are a few steps to get you closer:

# list basename followed by full path, sorted by basename, so identical names line up:
find . -name \*.txt | awk -F/ '{print $NF" "$0}' | sort

# the same listing, lowercased, for a case-insensitive comparison:
find . -name \*.txt | awk -F/ '{print $NF" "$0}' | tr '[A-Z]' '[a-z]' | sort

# print only the lowercased basenames and keep the ones that occur more than once:
find . -name \*.txt | awk -F/ '{print $NF}' | tr '[A-Z]' '[a-z]' | sort | uniq -d

# collect the full paths under each basename, one line per name:
find . -name \*.txt | awk -F/ '{f[$NF]=sprintf("%s %s",f[$NF],$0)}END{for (d in f){print d":"f[d]}}' | sort

# group paths by file size instead (field 7 of find -ls); same-size files are duplicate candidates:
find . -name \*.txt -ls | awk '{f[$7]=sprintf("%s %s",f[$7],$NF)}END{for (d in f){print d":"f[d]}}' | sort

# classify each file with file(1) and sort by the reported type (the third field):
find . -name \*.txt -exec file {} \; | sort -k 3
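
If those building blocks look right, a small script can tie them together and actually move the files, which covers your step B. Below is a minimal sketch, not a finished tool: it assumes bash, GNU find, and no tabs or newlines in file names; the destination directory ./dupes is my assumption, and the extension list just follows the types you named. It groups files by lowercased basename and moves every file whose name occurs more than once into ./dupes, encoding the original path in the new name so the members of each pair stay identifiable:

#!/bin/bash
# Sketch only: move files whose (lowercased) basename occurs more than
# once into a separate directory. DEST is an assumption; adjust to taste.
DEST=./dupes
mkdir -p "$DEST"

# Build "lowercased-basename<TAB>full-path" pairs, skipping DEST itself.
find . -path "$DEST" -prune -o -type f \
    \( -iname '*.pdf' -o -iname '*.chm' -o -iname '*.txt' -o -iname '*.doc' \) \
    -print |
awk -F/ '{print tolower($NF) "\t" $0}' | sort > /tmp/all-files.$$

# Keep only the basenames that occur more than once.
cut -f1 /tmp/all-files.$$ | uniq -d > /tmp/dupe-names.$$

# Move every file that belongs to a duplicated basename; slashes in the
# original path become underscores so the new names do not collide.
while IFS=$'\t' read -r name path; do
    if grep -qxF "$name" /tmp/dupe-names.$$; then
        mv "$path" "$DEST/$(echo "$path" | sed 's|^\./||; s|/|_|g')"
    fi
done < /tmp/all-files.$$

rm -f /tmp/all-files.$$ /tmp/dupe-names.$$

Run it on a copy of the tree first, or replace the mv with an echo until you trust the listing. Also note that near-duplicate names (README.txt vs. readme1.txt) need fuzzier matching than this exact comparison, so treat this as the exact-match pass only.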