Fuzzy DupeFinder
Overview
The Fuzzy DupeFinder is a tool for finding and eliminating duplicate files in large collections. Duplicates accumulate quickly, especially in large MP3 collections, and they waste disk space. It is time to put an end to that!
There are already lots of tools around to tackle this problem.

So why another tool?

Most existing tools either use checksums to compare file contents or compare file names byte by byte. With multimedia files in particular, the same content (video, music, images) can be encoded differently, for example with a different codec or a different bit rate. Sometimes the files are merely tagged differently. Such files are usually not recognized as duplicates.

This is where the Fuzzy DupeFinder comes into play. It compares file names using fuzzy logic.

Example:
01 - Artist - Song.mp3 vs. Song - Artist.mp3


A simple file name comparison would not report a match here. The Fuzzy DupeFinder finds these duplicates.
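The difference between exact and fuzzy comparison can be illustrated with a small sketch. This is not the algorithm used by the Fuzzy DupeFinder, which is not documented here. It is an assumed stand-in that sorts the words of each file name and scores the result with Python's `difflib.SequenceMatcher`; the function name `token_sort_similarity` is made up for this illustration.

```python
import re
from difflib import SequenceMatcher
from pathlib import PurePath


def token_sort_similarity(name_a: str, name_b: str) -> float:
    """Return a 0-100 similarity score that ignores word order and extension.

    Assumption: this word-sorting approach only illustrates the idea of a
    fuzzy match; the real tool may work differently.
    """
    def key(name: str) -> str:
        stem = PurePath(name).stem.lower()       # drop the extension
        tokens = re.findall(r"[a-z0-9]+", stem)  # split into words and numbers
        return " ".join(sorted(tokens))          # make the word order irrelevant

    return 100.0 * SequenceMatcher(None, key(name_a), key(name_b)).ratio()


a = "01 - Artist - Song.mp3"
b = "Song - Artist.mp3"
print(a == b)                       # exact comparison: no match
print(token_sort_similarity(a, b))  # fuzzy comparison: a high percentage
```

The exact comparison only answers yes or no, while the fuzzy score expresses how close the two names are, so a threshold can decide what counts as a duplicate.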

System requirements
  • Windows 2000, XP, Vista
  • .NET Framework (Download here)

How to install
  • Step 1: Download the zipped install kit. (Demo version: what's limited?)
  • Step 2: Extract the archive into a new directory.
  • Step 3: Run DupeFinder.exe. No installation is necessary.

    Buy the full version now!
    You can purchase the full version via PayPal for 7 US dollars or 5 euros. After the purchase, you can download the full version from the download area. Please make sure that your PayPal email address is still valid! The differences between the demo and the full version are described under "Demo version vs. full version".



    How to use
    Search options

    First, provide a directory to search in. All files in this directory and its subdirectories are compared against each other.
    It often makes sense to restrict the search to a certain file type. To do so, enter a file extension. Beginning with version 1.2, you can also enter multiple file types separated by semicolons, for example mp3;ogg.

    To exclude files whose names add little value to the comparison (like Track1, Track2, ...), enter a text pattern in the corresponding text box. "Track" would exclude Track1, Track2 and so on. You can also enter multiple patterns separated by semicolons.
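    The search options above can be sketched as a small file collector. This is a minimal illustration, not the program's actual code: the function names `parse_list` and `collect_files` are hypothetical, and it assumes that extensions and exclusion patterns are matched without regard to case and that a pattern matches anywhere in the file name.

```python
import os


def parse_list(text: str) -> list[str]:
    """Split a semicolon-separated setting such as "mp3;ogg" into clean items."""
    return [item.strip() for item in text.split(";") if item.strip()]


def collect_files(root: str, extensions: str = "", exclude: str = "") -> list[str]:
    """Collect files below root, filtered by extension and exclusion patterns.

    Assumptions: matching ignores case, and an exclusion pattern is a plain
    substring of the file name (not a wildcard or regular expression).
    """
    wanted = {"." + ext.lower().lstrip(".") for ext in parse_list(extensions)}
    skipped = [pattern.lower() for pattern in parse_list(exclude)]
    found = []
    for folder, _subdirs, files in os.walk(root):  # includes all subdirectories
        for name in files:
            extension = os.path.splitext(name)[1].lower()
            if wanted and extension not in wanted:
                continue  # not one of the requested file types
            if any(pattern in name.lower() for pattern in skipped):
                continue  # name contains an excluded pattern such as "Track"
            found.append(os.path.join(folder, name))
    return sorted(found)
```

    Calling `collect_files(r"C:\Music", "mp3;ogg", "Track")` would return every MP3 and Ogg file below that directory whose name does not contain "Track"; the path is only an example.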

    Comparison options

    Here you can adjust settings that influence the accuracy of the search algorithm. If all options are deselected, the file names are compared unaltered. Usually, however, it makes sense to normalize them first in order to improve the accuracy.

  • Remove spaces: The spaces in the file names are removed before the comparison.
  • Case sensitive: The comparison distinguishes between upper and lower case.
  • Regular expression: This is the most powerful (and most resource-intensive) option. A regular expression is evaluated before the comparison (see more on regular expressions here). With this method you can, for example, remove non-significant characters like _ & (). This can improve the accuracy dramatically. For example, the expression [^a-zA-Z] filters out all characters except the 26 letters (upper and lower case).
  • Threshold: A match above this percentage level is considered a duplicate. 70 is usually a good value.
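    The effect of these options can be sketched as follows. This is an illustration under stated assumptions, not the program's implementation: the names `normalize` and `is_duplicate` are hypothetical, characters matched by the regular expression are assumed to be removed, the order in which the options are applied is a guess, and `difflib.SequenceMatcher` stands in for the undocumented match score.

```python
import re
from difflib import SequenceMatcher


def normalize(name, remove_spaces=True, case_sensitive=False, pattern=r"[^a-zA-Z]"):
    """Normalize a file name (without extension) before the comparison."""
    if pattern:
        name = re.sub(pattern, "", name)  # drop everything the expression matches
    if remove_spaces:
        name = name.replace(" ", "")
    if not case_sensitive:
        name = name.lower()               # ignore upper/lower case differences
    return name


def is_duplicate(name_a, name_b, threshold=70.0, **options):
    """Report a duplicate when the match percentage is above the threshold."""
    a = normalize(name_a, **options)
    b = normalize(name_b, **options)
    score = 100.0 * SequenceMatcher(None, a, b).ratio()
    return score > threshold


print(normalize("Artist_-_Song (Live)"))             # artistsonglive
print(is_duplicate("Artist - Song", "artist_song"))  # True
```

    With the default options both names in the last line normalize to the same string, so they match at 100 percent. Passing `pattern=None, remove_spaces=False, case_sensitive=True` compares the names unaltered.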

    File pools

    You have basically two options to find duplicates:
  • Search within a single file pool. With this method you select a single set of directories. All files in that set are combined into a single list. Every file on that list is compared to every other file on that list.
  • Search within two file pools. With this method you select two sets of directories. All files of each set are combined into a list, so you have two lists, one for each directory set. Every file on list A is then compared to every file on list B. This option is especially useful for integrating new files into an existing collection.
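    The two modes differ only in which pairs of files are compared. A minimal sketch using Python's `itertools` (the function names are made up for this illustration):

```python
from itertools import combinations, product


def single_pool_pairs(files):
    """Every file is paired once with every other file of the same list."""
    return list(combinations(files, 2))


def two_pool_pairs(pool_a, pool_b):
    """Every file on list A is paired with every file on list B."""
    return list(product(pool_a, pool_b))


collection = ["a.mp3", "b.mp3", "c.mp3"]
new_files = ["x.mp3", "y.mp3"]
print(len(single_pool_pairs(collection)))          # 3 pairs: n * (n - 1) / 2
print(len(two_pool_pairs(new_files, collection)))  # 6 pairs: n * m
```

    In the two-pool mode the files inside the existing collection are never compared with each other, which keeps the number of comparisons small when only a few new files are added.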

    Result window

    After a search pass, the results are listed here. The left and right columns contain information about the files; the middle column contains the match.
    Clicking a file marks it for deletion. Double-clicking a file opens a preview.

    Demo version vs. full version
    The demo version is exactly the same program, except for these two limitations:
  • You cannot delete duplicate files directly from the user interface.
  • You cannot preview files from the user interface.
    The search engine that finds the duplicate files is identical and not limited in any way. The demo version also has no time limit.
