Question : Help with Shell Script: find and move duplicate files in a directory tree
Heya, folks. I've got a directory tree with 60+ GB of text documents (PDF, CHM, TXT, DOC) in a complex hierarchy. I need a script that will:
A) Walk the entire directory tree looking for file names that are duplicates or near-duplicates.
B) Place all the dupes in a separate directory, divided into groups A and B.
Find plus a nice regular expression would probably do the bulk of the work, but I'm new to scripting, so I don't know if that's the right place to start, nor (assuming it is) where to go from there. Should I walk the entire tree, build a database of every file in it, and parse that for duplicates? Or is there another way to do it on the fly?
Any help or suggestions would be appreciated. There aren't any other folks I can turn to for help on this one.
Thanks much --
KiT.
Answer : Help with Shell Script: find and move duplicate files in a directory tree
You more or less identified it yourself: you'd be better off getting a million monkeys to do it manually than relying on a single script ;-)
Anyway, here are a few steps to get you closer:
# 1. List every *.txt file as "basename  full-path", sorted by basename:
find . -name \*.txt | awk -F/ '{print $NF" "$0}' | sort
# 2. Same, but lowercased so "Foo.txt" and "foo.txt" compare equal:
find . -name \*.txt | awk -F/ '{print $NF" "$0}' | tr '[A-Z]' '[a-z]' | sort
# 3. Print only the basenames that occur more than once:
find . -name \*.txt | awk -F/ '{print $NF}' | tr '[A-Z]' '[a-z]' | sort | uniq -d
# 4. Group all paths under their shared basename:
find . -name \*.txt | awk -F/ '{f[$NF]=sprintf("%s %s",f[$NF],$0)} END{for (d in f) print d":"f[d]}' | sort
# 5. Group paths by file size (field 7 of "find -ls" output) instead of by name:
find . -name \*.txt -ls | awk '{f[$7]=sprintf("%s %s",f[$7],$NF)} END{for (d in f) print d":"f[d]}' | sort
# 6. Classify each file with file(1) and sort on the reported type:
find . -name \*.txt -exec file {} \; | sort -k3
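If you want to go one step further and actually move the name-duplicates aside, here is a rough sketch that builds on the third one-liner above. The DUPES directory name, the temp file location, and the "match on lowercased basename only" rule are all my assumptions, so adjust to taste and test on a copy of the tree first:

#!/bin/sh
# Sketch: park every file whose (lowercased) basename occurs more than
# once under ./DUPES.  Directory name, temp file, and the basename-only
# matching rule are assumptions, not a finished solution.

SRC=.                 # top of the tree to scan
DEST=./DUPES          # where the dupes get moved
LIST=/tmp/all_files.$$

mkdir -p "$DEST"

# One line per file: "lowercased-basename<TAB>full-path".
find "$SRC" -type f ! -path "$DEST/*" |
awk -F/ '{ print tolower($NF) "\t" $0 }' > "$LIST"

# Read the list twice: first count each basename, then print the paths
# of the names seen more than once, and move each file with a numeric
# prefix so two copies of "foo.txt" don't overwrite each other in $DEST.
n=0
awk -F'\t' 'NR==FNR { seen[$1]++; next } seen[$1] > 1 { print $2 }' "$LIST" "$LIST" |
while IFS= read -r path; do
    n=$((n + 1))
    mv "$path" "$DEST/${n}_$(basename "$path")"
done

rm -f "$LIST"

That only catches exact name matches (case-insensitively); it doesn't attempt near-duplicates, and it doesn't split the results into your A/B groups. If you'd rather match on content than on name and you have GNU coreutils, something like this lists every file whose checksum occurs more than once:

find . -type f -exec md5sum {} + | sort | uniq -w32 -D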