Need:
- Python 2.7
- numpy
- scikit-learn
- matplotlib

Commands:

For properties of LSH based clustering, run:
    python n_hashes_vs_counts_experiments.py

For comparing approximation results with k-means++, run:
    python plsh_vs_kmeans_comp.py

For other experiments, run:
    python plsh_experiments.py
greatwallisme/DistClust_via_LSH_L2
This repository contains the synthetic-data experiments from the paper "Distributed Clustering via LSH Based Data Partitioning" (ICML 2018). It is not a distributed implementation: everything runs on a single machine, and the code is intended only to demonstrate the properties of the technique and its approximation results.
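For readers unfamiliar with the idea, the following is a minimal single-machine sketch of LSH-based partitioning for L2 distance. It is not taken from the repository's scripts. It assumes the standard random-projection hash floor((a . x + b) / w) with Gaussian a and uniform b, and the names make_l2_hash, partition_by_lsh, n_hashes and bucket_width are hypothetical, chosen for illustration. Points whose concatenated hash values agree are placed in the same partition, and each partition could then be clustered on its own.

```python
import math
import random
from collections import defaultdict


def make_l2_hash(dim, n_hashes, bucket_width, seed=0):
    """Build a hash mapping a point to a tuple of n_hashes bucket ids.

    Assumption: each hash is floor((a . x + b) / w), with a drawn from a
    standard Gaussian and b uniform in [0, w). Names are illustrative only.
    """
    rng = random.Random(seed)
    # One random direction and one random offset per hash function.
    directions = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(n_hashes)]
    offsets = [rng.uniform(0.0, bucket_width) for _ in range(n_hashes)]

    def hash_point(point):
        key = []
        for direction, offset in zip(directions, offsets):
            projection = sum(a * x for a, x in zip(direction, point))
            key.append(int(math.floor((projection + offset) / bucket_width)))
        return tuple(key)

    return hash_point


def partition_by_lsh(points, n_hashes=2, bucket_width=4.0, seed=0):
    """Group point indices by their concatenated hash key."""
    hash_point = make_l2_hash(len(points[0]), n_hashes, bucket_width, seed)
    buckets = defaultdict(list)
    for index, point in enumerate(points):
        buckets[hash_point(point)].append(index)
    return dict(buckets)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, -0.1), (50.0, 50.0), (50.2, 49.9)]
    for key, members in sorted(partition_by_lsh(data).items()):
        print(key, members)
```

Increasing n_hashes makes a collision require agreement on more projections, so partitions tend to become smaller and more numerous. Increasing bucket_width has the opposite effect.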
Python