/scalable-ML

In this repo, I build a LogisticRegression prediction model with Dask and PySpark and initialize an AWS EMR cluster to run the entire pipeline.

Primary Language: Python

Scaleable-Ml

This project predicts the delinquency rate in the Freddie Mac dataset in a distributed way using Dask and PySpark.