Find all duplicate files in the current directory and its subdirectories with bash.
find . -not -empty -type f -printf '%s\n' | sort -rn | uniq -d | xargs -I{} find . -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate
Breakdown
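- Find all non-empty regular files and print the size of each in bytes.
- Sort the list of sizes numerically (in reverse, so the largest come first).
- Keep only the sizes that occur more than once.
- For each of those sizes, run find again to list the files of exactly that size.
- Compute the MD5 checksum of each of those files.
- Sort the checksum and file name lines so that identical checksums end up next to each other.
- Compare only the first 32 characters of each line (the MD5 digest) and print the repeated ones in groups separated by blank lines.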
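The steps above can be sketched as a commented function with one stage per line. This is only a sketch, assuming GNU findutils and coreutils. The name find_dupes and its directory argument are hypothetical, introduced here for illustration, and the -r flag on the second xargs (which skips md5sum when there are no candidates) is an addition that is not in the original one-liner.

```shell
# Sketch only: find_dupes is a hypothetical helper name.
# Assumes GNU find (-printf), GNU uniq (-w, --all-repeated) and md5sum.
# Usage: find_dupes [directory]   (defaults to the current directory)
find_dupes() (
    dir=${1:-.}
    # 1. Print the size in bytes of every non-empty regular file.
    find "$dir" -not -empty -type f -printf '%s\n' |
        # 2. Sort the sizes numerically so equal sizes are adjacent.
        sort -rn |
        # 3. Keep one copy of each size that occurs more than once.
        uniq -d |
        # 4. For each such size, list the files of exactly that size (NUL-separated).
        xargs -I{} find "$dir" -type f -size {}c -print0 |
        # 5. Hash every candidate; -r (an addition) skips md5sum if there are none.
        xargs -0 -r md5sum |
        # 6. Sort by checksum so identical checksums are adjacent.
        sort |
        # 7. Compare only the 32-character digest and print repeats in groups.
        uniq -w32 --all-repeated=separate
)
```

Files that share a size but differ in content drop out at the last stage, because their checksums are not repeated.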
Alternatively
Or do it the easy way and install a dedicated tool for finding duplicate files. fdupes is typically much faster than the one-liner above. On Debian-based systems, install it with:
apt-get install fdupes
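This does more or less the same thing as the one-liner: it compares file sizes and MD5 signatures, then confirms each match with a byte-by-byte comparison. The -r option recurses into subdirectories.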
fdupes -r .
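fdupes confirms each checksum match with a byte-by-byte comparison, a step the one-liner skips. A minimal sketch of applying the same check to two candidate files with cmp follows. The name same_content is a hypothetical helper introduced here. It is not part of fdupes or of the original one-liner.

```shell
# Sketch only: same_content is a hypothetical helper name.
# cmp -s prints nothing and exits with status 0 only when
# both files are byte-for-byte identical.
same_content() {
    cmp -s "$1" "$2"
}
```

It could be used on a pair reported by the one-liner, for example: same_content ./first ./second && echo identical, where ./first and ./second are placeholder file names.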