/Akin

Python library for detecting near-duplicate texts in a corpus at scale using Locality-Sensitive Hashing, as described in chapter three of Mining of Massive Datasets.
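
For readers unfamiliar with the technique, here is a minimal sketch of the shingling, MinHash, and banded LSH pipeline that chapter three of the book describes. It is not Akin's API: the function names (`shingles`, `minhash_signature`, `candidate_pairs`) and the parameters (`k`, `num_hashes`, `bands`) are hypothetical and chosen only for illustration, and the use of salted BLAKE2 as the hash family is an assumption.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=5):
    """Return the set of overlapping k-character shingles of a normalized text."""
    text = " ".join(text.lower().split())  # lowercase and collapse whitespace
    return {text[i:i + k] for i in range(max(len(text) - k + 1, 1))}


def _hash(seed, shingle):
    """One member of a hash family: 64-bit BLAKE2b salted with the seed (an assumption)."""
    digest = hashlib.blake2b(
        shingle.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(shingle_set, num_hashes=100):
    """MinHash signature: the minimum hash value of the set under each hash function."""
    return [min(_hash(seed, s) for s in shingle_set) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def candidate_pairs(signatures, bands=20):
    """Banded LSH: documents sharing an identical band land in the same bucket.

    `signatures` maps a document id to its MinHash signature.
    """
    rows = len(next(iter(signatures.values()))) // bands
    candidates = set()
    for b in range(bands):
        buckets = defaultdict(list)
        for doc_id, sig in signatures.items():
            buckets[tuple(sig[b * rows:(b + 1) * rows])].append(doc_id)
        for ids in buckets.values():
            candidates.update(combinations(sorted(ids), 2))
    return candidates


corpus = {
    "a": "The quick brown fox jumps over the lazy dog",
    "b": "the quick  brown fox jumps over the lazy dog",  # same text after normalization
    "c": "0123456789 9876543210 1357924680",
}
signatures = {doc_id: minhash_signature(shingles(text)) for doc_id, text in corpus.items()}
pairs = candidate_pairs(signatures)
```

Only pairs returned by `candidate_pairs` need an exact similarity check, which is what lets the approach scale beyond comparing every pair of documents. With `num_hashes` hash functions split into `bands` bands of `r` rows each, a pair with Jaccard similarity `s` becomes a candidate with probability `1 - (1 - s^r)^bands`, so the band count trades recall against false positives.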

Primary language: Python
License: MIT
