Tanzania_Water_Wells

A classifier to predict the condition of a water well

Primary language: Jupyter Notebook

Authors: Medrine Waeni, Felix Nyagah, Samson Kamunyu, Brian Magembe, Wendy Mwiti

Project goal

The goal is for the government of Tanzania to adopt the model developed here in order to improve general maintenance operations at water points and, ultimately, to meet the water needs of its citizens.

Column names

  1. date_recorded - The date the row was entered

  2. funder - Who funded the well

  3. gps_height - Altitude of the well

  4. installer - Organization that installed the well

  5. longitude - GPS coordinate

  6. latitude - GPS coordinate

  7. wpt_name - Name of the waterpoint if there is one

  8. num_private - Number of private waterpoints

  9. basin - Geographic water basin

  10. subvillage - Geographic location

  11. region - Geographic location

  12. region_code - Geographic location

  13. district_code - Geographic location

  14. lga - Geographic location

  15. ward - Geographic location

  16. population - Population around the well

  17. public_meeting - True/False

  18. recorded_by - Group entering this row of data

  19. scheme_management - Who operates the waterpoint

  20. scheme_name - Who operates the waterpoint

  21. permit - If the waterpoint is permitted

  22. construction_year - Year the waterpoint was constructed

  23. extraction_type - The kind of extraction the waterpoint uses

  24. extraction_type_group - The kind of extraction the waterpoint uses

  25. status_group - Condition of the well
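Since status_group is the column the classifier predicts, a natural first check is how its values are distributed. The following is a minimal sketch using only the Python standard library. The sample rows, the status labels, and the helper name status_counts are assumptions made up for illustration; they are not taken from the project's data or notebook.

```python
import csv
import io
from collections import Counter

# Made-up sample rows for illustration only; the real data has all
# 25 columns listed above. The status labels here are assumed.
SAMPLE_CSV = """status_group,region,construction_year
functional,Iringa,1999
non functional,Mara,2010
functional,Manyara,2009
functional needs repair,Mtwara,1986
"""


def status_counts(csv_file):
    """Count how many rows fall into each status_group value."""
    reader = csv.DictReader(csv_file)
    return Counter(row["status_group"] for row in reader)


counts = status_counts(io.StringIO(SAMPLE_CSV))
for status, n in counts.most_common():
    print(f"{status}: {n}")
```

With a real file, the same function could be called on an open file handle instead of the in-memory sample. A skewed distribution would be one reason to prefer a metric such as precision over plain accuracy.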

Findings

The Random Forest algorithm had the highest precision score of all the models evaluated and will be used as the final model. Its precision score was 66%, which means that, on average, about 66% of the waterpoints it assigned to a given status actually had that status.
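To make the precision figure concrete, here is a small sketch of how precision can be computed for a multi-class problem. It assumes macro averaging (the unweighted mean of the per-class precisions); the README does not state which averaging the project used. The labels and the function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Return {label: precision}, where precision = TP / (TP + FP).

    Only labels that were predicted at least once are included, so the
    denominator is never zero.
    """
    predicted = Counter(y_pred)  # TP + FP for each label
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)  # TP
    return {label: correct[label] / predicted[label] for label in predicted}


def macro_precision(y_true, y_pred):
    """Unweighted mean of the per-class precisions (an assumed choice)."""
    scores = per_class_precision(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical true and predicted statuses for six waterpoints.
y_true = ["functional", "functional", "non functional",
          "non functional", "functional", "non functional"]
y_pred = ["functional", "non functional", "non functional",
          "non functional", "functional", "functional"]

print(per_class_precision(y_true, y_pred))
print(round(macro_precision(y_true, y_pred), 3))
```

In this toy example each status is predicted three times and is right twice, so both per-class precisions and their mean are 2/3. Precision says nothing about the waterpoints of a given status that the model missed; that is measured by recall.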

Recommendations

  • The government should prioritize springs as the water source when building waterpoints and should avoid shallow wells and boreholes, as pumps drawing from these sources break down sooner.

  • Water points with enough water should be closely monitored, as the high use could lead to their failure.