Efficient implementation for finding the pairs with the lowest Levenshtein distance in a list of Excel data
How to install:
Install git
Clone repository: git clone https://github.com/austrian-code-wizard/duplicateDetector
Alternatively, use the GitHub web interface to get a copy of the repository
Move into repository: cd duplicateDetector
Make sure you have Python 3.7 installed.
Install virtualenv: python3 -m pip install virtualenv
Create venv: python3 -m virtualenv venv
Activate venv: . venv/bin/activate
Install repository: python setup.py install
Run the example: python example.py (first edit example.py so that it points to a valid Excel file path and valid fields)
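For orientation, here is a minimal sketch of the underlying idea: compute the Levenshtein distance for every pair of values and keep the pairs with the lowest distance. It is a plain brute-force illustration and not this repository's implementation or API. The function names `levenshtein` and `closest_pairs` are hypothetical, and the values come from an ordinary Python list rather than an Excel file.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # iterate the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def closest_pairs(values, top_n=3):
    """Return the top_n pairs with the smallest distance as (distance, a, b)."""
    scored = [(levenshtein(a, b), a, b) for a, b in combinations(values, 2)]
    scored.sort(key=lambda item: item[0])  # stable sort keeps input order on ties
    return scored[:top_n]


if __name__ == "__main__":
    # Hypothetical sample values standing in for one Excel column.
    names = ["Jon Smith", "John Smith", "Jane Doe", "Jan Doe", "Alice"]
    for distance, a, b in closest_pairs(names, top_n=2):
        print(distance, a, b)
```

This sketch compares all pairs, so its cost grows quadratically with the number of values. It is meant only to show what "lowest Levenshtein distance pairs" means, not how the package achieves its efficiency.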