- Built fuzzy merging algorithms with edit-distance constraints to find similar string pairs for data cleaning.
- Implemented the Pass-Join algorithm in C++ and Python to perform string similarity joins.
- Made the classic fuzzy merging algorithm 90% faster.
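
The partition-and-verify idea behind Pass-Join can be sketched in a few lines of Python. This is a minimal illustration, not the code in the repository: the function names (`edit_distance_within`, `partition`, `pass_join`) are hypothetical, and it uses a simple shift-based substring selection rather than the tighter selection methods of the full algorithm. Each string is split into `tau + 1` segments; if two strings are within edit distance `tau`, at least one segment of the shorter one must appear unchanged in the other, shifted by at most `tau` positions, so only those pairs need to be verified.

```python
from collections import defaultdict


def edit_distance_within(a, b, tau):
    """Return True if the Levenshtein distance between a and b is <= tau."""
    if abs(len(a) - len(b)) > tau:
        return False
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        if min(cur) > tau:  # row minima never decrease, so we can stop early
            return False
        prev = cur
    return prev[-1] <= tau


def partition(length, tau):
    """Split [0, length) into tau + 1 near-equal segments as (start, seg_len)."""
    k = tau + 1
    base, extra = divmod(length, k)
    segs, start = [], 0
    for i in range(k):
        seg_len = base + (1 if i >= k - extra else 0)  # last `extra` segments are longer
        segs.append((start, seg_len))
        start += seg_len
    return segs


def pass_join(strings, tau):
    """Self-join: return sorted index pairs (i, j), i < j, with distance <= tau."""
    order = sorted(range(len(strings)), key=lambda i: (len(strings[i]), i))
    index = defaultdict(list)  # (length, segment number, segment text) -> string ids
    results = set()
    for sid in order:
        s = strings[sid]
        candidates = set()
        # Probe only indexed strings whose length is within tau of len(s).
        for l in range(max(0, len(s) - tau), len(s) + 1):
            for seg_no, (p, seg_len) in enumerate(partition(l, tau)):
                lo = max(0, p - tau)
                hi = min(len(s) - seg_len, p + tau)
                for q in range(lo, hi + 1):
                    key = (l, seg_no, s[q:q + seg_len])
                    candidates.update(index.get(key, ()))
        # Verify each candidate with the exact edit-distance check.
        for rid in candidates:
            if edit_distance_within(strings[rid], s, tau):
                results.add((min(rid, sid), max(rid, sid)))
        # Index the segments of s for later, longer-or-equal strings.
        for seg_no, (p, seg_len) in enumerate(partition(len(s), tau)):
            index[(len(s), seg_no, s[p:p + seg_len])].append(sid)
    return sorted(results)


words = ["kitten", "sitten", "sitting", "mitten", "apple", "apply", "ample"]
pairs = pass_join(words, tau=1)
```

Processing strings in order of length means each string only probes segments of strings that are no longer than itself, which avoids reporting a pair twice.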
sapphire921/Fuzzy_Merge
Use C++ and Python to implement the passjoin algorithm ⚔️
Jupyter Notebook