/parallel_k_means

Coursework for Large Scale Data Methods

Primary language: Jupyter Notebook

Parallelized K-means Clustering

A script that implements a parallelized k-means clustering algorithm using Python's multiprocessing module.
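The notebook itself is not reproduced here, so the following is only a minimal sketch of one common way to parallelize k-means with multiprocessing, not necessarily the approach this repository takes. It assumes the points are split into chunks, each worker computes per-cluster coordinate sums and counts for its chunk, and the parent process combines those partial results into new centroids. All function and parameter names (partial_sums, update_centroids, kmeans, parallel_kmeans, map_fn) are hypothetical.

```python
from functools import partial
from multiprocessing import Pool


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def partial_sums(chunk, centroids):
    """Map step: per-cluster coordinate sums and point counts for one chunk."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in chunk:
        # Index of the nearest centroid (the first one wins on ties).
        nearest = min(range(len(centroids)),
                      key=lambda j: squared_distance(point, centroids[j]))
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += point[d]
    return sums, counts


def update_centroids(partials, centroids):
    """Reduce step: combine the per-chunk results into new centroids."""
    new_centroids = []
    for j, old in enumerate(centroids):
        total = sum(counts[j] for _, counts in partials)
        if total == 0:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(list(old))
            continue
        new_centroids.append([sum(sums[j][d] for sums, _ in partials) / total
                              for d in range(len(old))])
    return new_centroids


def kmeans(points, centroids, n_chunks=4, max_iter=100, tol=1e-9, map_fn=map):
    """Lloyd's iterations; map_fn decides whether chunks run serially or in parallel."""
    chunks = [points[i::n_chunks] for i in range(n_chunks)]
    chunks = [chunk for chunk in chunks if chunk]
    for _ in range(max_iter):
        worker = partial(partial_sums, centroids=centroids)
        partials = list(map_fn(worker, chunks))
        new_centroids = update_centroids(partials, centroids)
        # Largest squared movement of any centroid in this iteration.
        shift = max(squared_distance(a, b)
                    for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids


def parallel_kmeans(points, centroids, processes=4, **kwargs):
    """Run kmeans with one chunk per worker process via Pool.map."""
    with Pool(processes=processes) as pool:
        return kmeans(points, centroids, n_chunks=processes,
                      map_fn=pool.map, **kwargs)
```

Passing the mapping function in as a parameter keeps the serial and parallel paths identical apart from how the chunks are dispatched, which makes the logic easy to check without spawning processes. When calling parallel_kmeans from a script, it is safest to do so under an `if __name__ == "__main__":` guard, since some platforms start worker processes by re-importing the main module.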

Data

The dataset is an altered and simplified version of combined Home Mortgage Disclosure Act and Census ACS data from an Urban Institute project: https://adrf.urban.org/