Similarity-Search-and-Join

Similarity join and search algorithms for edit distance and Jaccard similarity.

Primary language: C++. License: BSD 2-Clause "Simplified" (BSD-2-Clause).

How to build:

Prerequisites:

  1. g++ >= 4.8
  2. boost >= 1.5
  3. GNU make

Then just run 'make'!

How to run:

  1. For edit distance (ed), the command is './PATH_TO_EXECUTABLE data_file_name threshold q'.
  2. For token-based metrics, the command is './PATH_TO_EXECUTABLE metric data_file_name threshold'. The metric must be one of 'jaccard', 'cosine', or 'dice'.
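
For reference, the measures named above can be sketched in a few lines of C++. This is a minimal brute-force illustration of the textbook definitions, not the code in this repository: the function names (overlap, token_similarity, edit_distance) are hypothetical, tokens are assumed to form plain sets with no weights or duplicates, and the indexing and filtering that a real join or search algorithm needs are left out. How this project interprets 'threshold' for each metric is not stated here, so the sketch only computes the raw values.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Number of tokens that appear in both sets.
std::size_t overlap(const std::set<std::string>& a,
                    const std::set<std::string>& b) {
    std::size_t n = 0;
    for (const auto& token : a) n += b.count(token);
    return n;
}

// Set-based similarity for 'jaccard', 'cosine' or 'dice'.
// Returns -1.0 for an unknown metric name (an assumption of this sketch).
double token_similarity(const std::string& metric,
                        const std::set<std::string>& a,
                        const std::set<std::string>& b) {
    if (a.empty() && b.empty()) return 1.0;  // convention chosen for this sketch
    if (a.empty() || b.empty()) return 0.0;
    const double o = static_cast<double>(overlap(a, b));
    const double na = static_cast<double>(a.size());
    const double nb = static_cast<double>(b.size());
    if (metric == "jaccard") return o / (na + nb - o);       // |A∩B| / |A∪B|
    if (metric == "cosine")  return o / std::sqrt(na * nb);  // |A∩B| / sqrt(|A||B|)
    if (metric == "dice")    return 2.0 * o / (na + nb);     // 2|A∩B| / (|A|+|B|)
    return -1.0;
}

// Levenshtein edit distance using a two-row dynamic program.
std::size_t edit_distance(const std::string& s, const std::string& t) {
    std::vector<std::size_t> prev(t.size() + 1), cur(t.size() + 1);
    for (std::size_t j = 0; j <= t.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= s.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= t.size(); ++j) {
            const std::size_t substitute = prev[j - 1] + (s[i - 1] != t[j - 1] ? 1 : 0);
            const std::size_t remove_one = prev[j] + 1;
            const std::size_t insert_one = cur[j - 1] + 1;
            cur[j] = std::min({substitute, remove_one, insert_one});
        }
        std::swap(prev, cur);
    }
    return prev[t.size()];
}
```

With these definitions, a pair of records would qualify for an edit-distance join when edit_distance(s, t) is at most the threshold, and for a token-based join when token_similarity(metric, a, b) is at least the threshold. That reading of the threshold is the usual one and is assumed here.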