/Fuzzy_Merge

A C++ and Python implementation of the Pass-Join algorithm :crossed_swords:

Primary Language: Jupyter Notebook

FuzzyMerge

  1. Built fuzzy merging algorithms that use an edit-distance threshold to find similar string pairs for data cleaning.
  2. Implemented the Pass-Join algorithm for string similarity joins in C++ and Python.
  3. Made the classic fuzzy merging algorithm 90% faster.
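
The idea behind Pass-Join can be sketched in a few lines of Python. This is a minimal illustration, not the code in this repository: the names `edit_distance`, `segments`, and `pass_join` are hypothetical, and the substring selection uses a simple position window rather than any tighter pruning the real implementation may apply. Each string is split into `tau + 1` segments; if two strings are within edit distance `tau`, at least one segment of the one must appear unchanged in the other, shifted by at most `tau` positions, so an inverted index over segments yields a small candidate set that is then verified.

```python
from collections import defaultdict


def edit_distance(a, b):
    """Plain dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def segments(length, tau):
    """Split [0, length) into tau + 1 near-equal (start, size) segments."""
    k = tau + 1
    base, extra = divmod(length, k)
    out, start = [], 0
    for i in range(k):
        size = base + (1 if i >= k - extra else 0)  # last `extra` segments are longer
        out.append((start, size))
        start += size
    return out


def pass_join(strings, tau):
    """Return all pairs of distinct strings with edit distance <= tau."""
    # Process strings by increasing length so each probe only meets shorter or equal strings.
    strings = sorted(set(strings), key=lambda s: (len(s), s))
    index = defaultdict(list)  # (length, segment number, segment text) -> string ids
    results = set()
    for sid, s in enumerate(strings):
        candidates = set()
        for length in range(max(0, len(s) - tau), len(s) + 1):
            for seg_no, (pos, size) in enumerate(segments(length, tau)):
                # A surviving segment can shift by at most tau positions.
                lo = max(0, pos - tau)
                hi = min(len(s) - size, pos + tau)
                for start in range(lo, hi + 1):
                    key = (length, seg_no, s[start:start + size])
                    candidates.update(index.get(key, ()))
        # Verify every candidate with the exact edit distance.
        for rid in candidates:
            if edit_distance(strings[rid], s) <= tau:
                results.add((strings[rid], s))
        # Index the segments of s for later, longer strings.
        for seg_no, (pos, size) in enumerate(segments(len(s), tau)):
            index[(len(s), seg_no, s[pos:pos + size])].append(sid)
    return results
```

For example, `pass_join(["kitten", "sitten", "mitten", "sitting"], 1)` returns the three pairs among `kitten`, `mitten`, and `sitten`, while `sitting` is at distance 2 or more from each of them and is left out. The speedup over a brute-force merge comes from verifying only the candidates found through the segment index instead of every pair.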