
Pump it Up: Data Mining the Water Table (hosted by DrivenData)


Tanzania-water-table

I am using datasets provided by drivendata.org to predict whether water pumps in Tanzania are functional, in need of repair, or non-functional.

Data-explore shows my preliminary analysis of the data, including some countplots made with seaborn.
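A minimal sketch of the kind of seaborn countplot used in the exploration. The rows are made-up toy data, and the column names `status_group` and `quantity` are assumptions about the dataset layout rather than something copied from the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy rows for illustration only; column names are assumed.
df = pd.DataFrame({
    "status_group": ["functional", "non functional", "functional",
                     "functional needs repair", "non functional", "functional"],
    "quantity": ["enough", "dry", "enough", "insufficient", "dry", "seasonal"],
})

fig, ax = plt.subplots(figsize=(6, 4))
# One bar per (status, quantity) combination, counting rows in each group.
sns.countplot(data=df, x="status_group", hue="quantity", ax=ax)
ax.set_xlabel("Pump status")
ax.set_ylabel("Number of pumps")
fig.tight_layout()
```

In a Jupyter notebook the `matplotlib.use("Agg")` line can be dropped so the figure renders inline.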

Clean-data shows how I removed features and created new ones for operation year, season, and rural/urban setting.
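A rough sketch of how the three engineered features could be derived with pandas. Everything specific here is an assumption for illustration: the column names (`date_recorded`, `construction_year`, `population`), the month-to-season mapping, and the population threshold for "urban" are hypothetical and may differ from what the notebook does.

```python
import pandas as pd

# Toy rows for illustration only; column names are assumed.
df = pd.DataFrame({
    "date_recorded": ["2011-03-14", "2013-07-02", "2012-11-20"],
    "construction_year": [1999, 2010, 0],  # 0 is treated here as "unknown"
    "population": [120, 4500, 30],
})
df["date_recorded"] = pd.to_datetime(df["date_recorded"])

# Operation years: year recorded minus year built, left missing when unknown.
built = df["construction_year"].where(df["construction_year"] > 0)
df["operation_years"] = df["date_recorded"].dt.year - built


def month_to_season(month: int) -> str:
    """Hypothetical month-to-season mapping, not the notebook's definition."""
    if month in (3, 4, 5):
        return "long_rains"
    if month in (11, 12):
        return "short_rains"
    return "dry"


df["season"] = df["date_recorded"].dt.month.map(month_to_season)

# Rural/urban flag from a hypothetical population cutoff.
URBAN_POPULATION_THRESHOLD = 1000
df["setting"] = df["population"].apply(
    lambda p: "urban" if p >= URBAN_POPULATION_THRESHOLD else "rural"
)
```

Leaving `operation_years` missing for unknown construction years, rather than computing a bogus age from 0, keeps the choice of imputation explicit in a later step.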

Train-test-split holds out part of the training data and reports accuracy, log loss, a classification report, and a confusion matrix.
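A minimal sketch of those four evaluation steps with scikit-learn. It uses synthetic three-class data from `make_classification` as a stand-in for the pump features, and the split size and model settings are assumptions, not the values from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, log_loss)
from sklearn.model_selection import train_test_split

# Synthetic 3-class data standing in for the cleaned pump features.
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

# Hold out 25% of the rows, keeping class proportions similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)         # hard labels for accuracy and the matrix
y_proba = model.predict_proba(X_test)  # class probabilities for log loss

accuracy = accuracy_score(y_test, y_pred)
loss = log_loss(y_test, y_proba, labels=model.classes_)
cm = confusion_matrix(y_test, y_pred, labels=model.classes_)
report = classification_report(y_test, y_pred)

print(f"accuracy: {accuracy:.4f}  log loss: {loss:.4f}")
print(cm)
print(report)
```

Log loss is computed from the predicted probabilities, while accuracy and the confusion matrix use the hard labels, so the two can disagree about which model is better.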

Model-and-predict-data shows the model, a random forest classifier, and the predictions it generates. With the random forest I currently have a score of 0.8020, which ranks me 176th out of 1553 competitors.
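A small sketch of fitting a random forest and building a prediction table, assuming scikit-learn. The toy rows, the feature names (`operation_years`, `setting`), the hyperparameters, and the `id`/`status_group` output layout are assumptions for illustration, not the notebook's actual pipeline.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy training and test rows for illustration only; names are assumed.
train = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "operation_years": [2, 15, 30, 4, 22, 9],
    "setting": ["urban", "rural", "rural", "urban", "rural", "urban"],
    "status_group": ["functional", "functional needs repair", "non functional",
                     "functional", "non functional", "functional"],
})
test = pd.DataFrame({
    "id": [7, 8],
    "operation_years": [3, 28],
    "setting": ["urban", "rural"],
})
features = ["operation_years", "setting"]

# One-hot encode the categorical column and pass numeric columns through.
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), ["setting"])],
    remainder="passthrough",
)
pipeline = Pipeline([
    ("preprocess", preprocess),
    ("rfc", RandomForestClassifier(n_estimators=200, random_state=42)),
])
pipeline.fit(train[features], train["status_group"])

# One predicted status per test pump, keyed by its id.
submission = pd.DataFrame({
    "id": test["id"],
    "status_group": pipeline.predict(test[features]),
})
print(submission)
```

Wrapping the encoder and the classifier in one `Pipeline` means the test rows get exactly the same encoding as the training rows.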

Map of data using cartodb: https://jdills26.cartodb.com/viz/fceaae6e-0f04-11e6-ba94-0ef24382571b/public_map