/KalmanFilter-1004

Data Polygamy Project for DSGA-1004

data - the four .tsv files we tested our algorithms with
images - the outlier-factor plots generated during our testing
proposal - project proposal

prepRDD.py - functions to find and remove missing data
GWE_pyspark.py - PySpark implementation of GWE
GWE_local.ipynb - a local Python and pandas implementation of GWE, tested on our local data
outliers - PySpark implementation of K-modes, with results on the given data files from DUMBO (NYU HPC server)
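
As a rough illustration of the kind of cleaning prepRDD.py is described as doing (finding and removing missing data), here is a minimal plain-Python sketch. It is not taken from prepRDD.py: the function names, the set of missing-value markers, and the fixed column count are all assumptions made for the example.

```python
# Hypothetical sketch; names and missing-value markers are assumptions,
# not the contents of prepRDD.py.
MISSING_TOKENS = {"", "NA", "N/A", "NULL", "null", "NaN", "nan"}


def parse_tsv_line(line):
    """Split one tab-separated line into a list of stripped fields."""
    return [field.strip() for field in line.rstrip("\n").split("\t")]


def has_missing(fields, expected_len):
    """Return True if the row has the wrong width or an assumed missing marker."""
    return len(fields) != expected_len or any(f in MISSING_TOKENS for f in fields)


def drop_missing(lines, expected_len):
    """Parse the lines and keep only rows with no missing data."""
    rows = (parse_tsv_line(line) for line in lines)
    return [row for row in rows if not has_missing(row, expected_len)]


# Small made-up sample: rows "b" and "c" contain missing values, row "d" is short.
sample = ["a\t1\tx", "b\t\ty", "c\tNA\tz", "d\t4", "e\t5\tw"]
clean = drop_missing(sample, expected_len=3)
```

Because parse_tsv_line and has_missing are plain functions, the same logic could in principle be applied to a Spark RDD with map and filter, for example sc.textFile(path).map(parse_tsv_line).filter(lambda r: not has_missing(r, 3)), where sc and path are placeholders.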