| Overview |
Fuzzy DupeFinder is a tool for finding and eliminating duplicate files in large collections.
Duplicates accumulate quickly, especially in large MP3 collections, and they waste
disk space. This has to come to an end!
There are already lots of tools around to tackle this problem.
So why another tool?
Most existing tools compare file contents by checksum or compare file names byte by byte. Multimedia files
in particular can hold the same content (video, music, images) encoded differently, with different
codecs or different bit rates. Sometimes they are only tagged differently.
Such files are usually not recognized as duplicates.
This is where Fuzzy DupeFinder comes into play. It compares file names using fuzzy logic.
Example:
01 - Artist - Song.mp3 vs. Song - Artist.mp3
A simple file name comparison would not produce a match here. Fuzzy DupeFinder finds these duplicates.
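The page does not document the matching algorithm itself, so the following is only a minimal sketch of one way such a comparison could work: it treats each file name as a set of words and measures the overlap. The function names `name_tokens` and `token_similarity` are made up for this illustration and are not part of the tool.

```python
import re

def name_tokens(file_name):
    """Split a file name into lower-case alphabetic words, ignoring the extension."""
    stem = file_name.rsplit(".", 1)[0]
    return frozenset(word.lower() for word in re.findall(r"[A-Za-z]+", stem))

def token_similarity(name_a, name_b):
    """Percentage of words the two names share (Jaccard index times 100)."""
    a, b = name_tokens(name_a), name_tokens(name_b)
    if not a and not b:
        return 0.0
    return 100.0 * len(a & b) / len(a | b)

first = "01 - Artist - Song.mp3"
second = "Song - Artist.mp3"
print(first == second)                  # False: a plain comparison sees no match
print(token_similarity(first, second))  # 100.0: both names contain the same words
```

Word order and the leading track number do not affect the score in this sketch, which is why the two example names match.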
|
| System requirements |
|
Windows 2000, XP, Vista
.NET Framework (Download here)
|
| How to install |
|
Step 1: Download the zipped install kit. (Demo version - what's limited?)
Step 2: Extract the archive into a new directory
Step 3: Run DupeFinder.exe. No installation necessary.
|
| Buy the full version now! |
You can purchase the full version via PayPal for 7 US dollars or 5 euros. After your purchase, you can download the full version
from the download area. Please make sure that your PayPal email address is still valid!
You can learn more about the differences between the demo and the full version here.
|
| Screenshots |
|
| How to use |
Search options
First, provide a directory to search in. All files in this directory and its subdirectories
are compared against each other.
It often makes sense to restrict the search to a certain file type. To do so, enter a file extension.
Beginning with version 1.2 you can also enter multiple file types separated by semicolons:
mp3;ogg for example.
To exclude files whose names add little value to the comparison (like Track1, Track2, ...), enter a text pattern in
the corresponding text box. "Track" would exclude Track1, Track2 and so on. You can also enter multiple
patterns separated by semicolons.
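The search options above can be sketched as a small file filter. This is an illustration only: the helper names `parse_list`, `is_candidate` and `collect_files` are hypothetical, and treating an exclusion pattern as a case-insensitive substring of the file name is an assumption, since the page only gives the "Track" example.

```python
import os

def parse_list(text):
    """Split a semicolon-separated option string into lower-case entries."""
    return [part.strip().lower() for part in text.split(";") if part.strip()]

def is_candidate(file_name, extensions="", exclude_patterns=""):
    """True if the file has a wanted extension and matches no exclusion pattern."""
    stem, ext = os.path.splitext(file_name)
    wanted = parse_list(extensions)
    # An empty extension list means "all file types".
    if wanted and ext.lstrip(".").lower() not in wanted:
        return False
    # Assumption: a pattern excludes every name that contains it.
    return not any(p in stem.lower() for p in parse_list(exclude_patterns))

def collect_files(root, extensions="", exclude_patterns=""):
    """Walk root and its subdirectories, yielding the paths of candidate files."""
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            if is_candidate(name, extensions, exclude_patterns):
                yield os.path.join(folder, name)

print(is_candidate("Track1.mp3", "mp3;ogg", "Track"))  # False: excluded by pattern
print(is_candidate("Song.ogg", "mp3;ogg", "Track"))    # True
```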
Comparison options
Here you can adjust some settings to influence the accuracy of the search algorithm.
If all options are deselected, the file names are not altered before the comparison. Usually, however,
it makes sense to normalize them in order to improve the accuracy.
Remove spaces: The spaces in the file name are removed before the comparison.
Case sensitive: The comparison distinguishes between upper and lower case.
Regular expression: This is the most powerful (and most resource-intensive) option. A regular
expression is applied to the file name before the comparison (see more on regular expressions here).
With this method you can, for example, remove non-significant characters like _ & (). This can improve the accuracy dramatically.
For example, the expression [^a-zA-Z] filters out all characters except the 26 letters (upper and lower case).
Threshold: A match above this percentage level is considered a duplicate. 70 is usually a good value.
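To make the effect of these options concrete, here is a rough sketch of the normalization step followed by a threshold check. It assumes that text matched by the regular expression is removed, and it uses Python's `difflib.SequenceMatcher` as a stand-in similarity measure, because the page does not say which measure the tool uses. The names `normalize`, `match_percent` and `is_duplicate` are hypothetical.

```python
import re
from difflib import SequenceMatcher

def normalize(name, remove_spaces=True, case_sensitive=False, pattern=None):
    """Apply the comparison options to a file name before it is compared."""
    if pattern:
        # Assumption: everything the expression matches is dropped.
        name = re.sub(pattern, "", name)
    if remove_spaces:
        name = name.replace(" ", "")
    if not case_sensitive:
        name = name.lower()
    return name

def match_percent(name_a, name_b, **options):
    """Similarity of the two normalized names as a percentage (stand-in metric)."""
    a = normalize(name_a, **options)
    b = normalize(name_b, **options)
    return 100.0 * SequenceMatcher(None, a, b).ratio()

def is_duplicate(name_a, name_b, threshold=70, **options):
    """A match above the threshold counts as a duplicate."""
    return match_percent(name_a, name_b, **options) > threshold

print(normalize("01 - Artist_Song (live)", pattern="[^a-zA-Z]"))  # artistsonglive
print(is_duplicate("Artist - Song", "artist_song", pattern="[^a-zA-Z]"))  # True
```

With the expression [^a-zA-Z] both example names reduce to the same string, so they match at 100 percent regardless of the separators used.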
File pools
There are basically two ways to find duplicates:
Search within a single file pool. With this method you select a single set of directories. All files in that set
are combined into a single list. Every file on that list is compared to every other file on that list.
Search within two file pools. With this method you select two sets of directories. All files of each set are combined
into a list, so you have two lists, one for each directory set. Every file on list A is then compared to every file on list B.
This option is especially useful for integrating new files into an existing collection.
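The difference between the two modes comes down to which pairs of files are compared. The sketch below illustrates this with Python's `itertools`; the function names are hypothetical, and comparing each pair only once in the single-pool case is an assumption based on the comparison being symmetric.

```python
from itertools import combinations, product

def pairs_single_pool(files):
    """Every file against every other file on the same list, each pair once."""
    return list(combinations(files, 2))

def pairs_two_pools(pool_a, pool_b):
    """Every file on list A against every file on list B."""
    return list(product(pool_a, pool_b))

collection = ["a.mp3", "b.mp3", "c.mp3"]
new_files = ["x.mp3", "y.mp3"]

print(len(pairs_single_pool(collection)))           # 3
print(len(pairs_two_pools(new_files, collection)))  # 6
```

In the two-pool case, files within the existing collection are never compared with each other, which keeps the number of comparisons small when only a few new files are added.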
Result window
After a search pass, the results are listed here. The left and right columns contain information about
the files; the middle column contains the match.
Clicking a file marks it for deletion. Double-clicking a file previews it.
|
| Demo version vs. full version |
The demo version is exactly the same program, except for these two limitations:
You cannot delete duplicate files directly from the user interface.
You cannot preview files from the user interface.
The search engine that finds the duplicate files is identical and not limited in any way.
The demo version is also not time-limited.
|