These algorithms implement Locality-Sensitive Hashing (LSH) and are based on the University of Auckland COMPSCI 753 course and Stanford University's Mining of Massive Datasets. They cover the following topics:
- Document-word shingle matrix
- Hashing the shingle matrix into a signature matrix
- Calculating similarity based on the signature matrix (TBC)
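
The three topics above can be sketched end to end in a few lines of Python. This is only an illustrative sketch, not the code in this repository. It assumes word-level shingles of length `k` and the MinHash scheme described in Mining of Massive Datasets, with random hash functions of the form `(a * row + b) mod p`. All function names (`word_shingles`, `shingle_matrix`, `minhash_signatures`, `signature_similarity`) and the parameters `k`, `num_hashes`, and `seed` are hypothetical, chosen for the example.

```python
import random

PRIME = 2_147_483_647  # 2^31 - 1, assumed larger than the number of shingles


def word_shingles(text, k=2):
    """Return the set of k-word shingles in a document (assumed tokenisation: whitespace)."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def shingle_matrix(docs, k=2):
    """Build a 0/1 matrix: one row per distinct shingle, one column per document."""
    sets = [word_shingles(d, k) for d in docs]
    vocab = sorted(set().union(*sets))
    matrix = [[1 if s in doc_set else 0 for doc_set in sets] for s in vocab]
    return vocab, matrix


def minhash_signatures(matrix, num_hashes=100, seed=0):
    """Compress the shingle matrix into a num_hashes x n_docs signature matrix."""
    rng = random.Random(seed)
    n_docs = len(matrix[0]) if matrix else 0
    # Each (a, b) pair defines one hash function h(r) = (a * r + b) mod PRIME.
    hashes = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
              for _ in range(num_hashes)]
    sig = [[float("inf")] * n_docs for _ in range(num_hashes)]
    for r, row in enumerate(matrix):
        row_hashes = [(a * r + b) % PRIME for a, b in hashes]
        for c, bit in enumerate(row):
            if bit:
                # Keep the smallest hash value seen among rows where the column has a 1.
                for i, h in enumerate(row_hashes):
                    if h < sig[i][c]:
                        sig[i][c] = h
    return sig


def signature_similarity(sig, c1, c2):
    """Estimate Jaccard similarity as the fraction of signature rows that agree."""
    agree = sum(1 for row in sig if row[c1] == row[c2])
    return agree / len(sig)


if __name__ == "__main__":
    docs = [
        "the quick brown fox jumps over the lazy dog",
        "the quick brown fox leaps over the lazy dog",
        "an entirely unrelated sentence about hashing",
    ]
    vocab, matrix = shingle_matrix(docs, k=2)
    sig = minhash_signatures(matrix, num_hashes=200, seed=42)
    print("doc 0 vs doc 1:", signature_similarity(sig, 0, 1))
    print("doc 0 vs doc 2:", signature_similarity(sig, 0, 2))
```

The estimate from `signature_similarity` approaches the true Jaccard similarity of the shingle sets as `num_hashes` grows. Documents too short to produce any shingle keep an all-infinity column in this sketch, so they would need separate handling in real use.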