Water-Quality-Prediction-Model Project

Overview

This repository contains code and resources for a machine learning project that predicts water quality from physicochemical properties. The goal is to develop a system for assessing the potability of water samples, a task that matters for public health and safety.

Dataset

The dataset consists of physicochemical properties of water samples, each labeled with its potability status. It is used to train and evaluate the machine learning models.

Model Building

Several machine learning models were trained and evaluated: decision trees, K-nearest neighbors, logistic regression, random forests, XGBoost, Gaussian naive Bayes, support vector machines, and AdaBoost. The models were optimized through hyperparameter tuning and evaluated with cross-validation.
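
As a rough illustration of how hyperparameter tuning and cross-validation can be combined, the sketch below runs a small grid search over three of the model families listed above using scikit-learn. It is not the notebook code from this repository. The synthetic data, the feature count, the parameter grids, and names such as candidates and compare_models are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the water quality data (assumption: numeric features
# and a binary potability label; the feature count here is arbitrary).
X, y = make_classification(
    n_samples=300, n_features=9, n_informative=5, random_state=0
)

# Stratified folds keep the class balance similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Hypothetical candidates and grids, chosen only to keep the example small.
candidates = {
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [3, 5, None]},
    ),
    "logistic_regression": (
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.1, 1.0, 10.0]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100]},
    ),
}


def compare_models(X, y):
    """Grid-search each candidate and return its best mean CV accuracy."""
    results = {}
    for name, (estimator, grid) in candidates.items():
        search = GridSearchCV(estimator, grid, cv=cv, scoring="accuracy")
        search.fit(X, y)
        results[name] = (search.best_score_, search.best_params_)
    return results


results = compare_models(X, y)
# Print the candidates from highest to lowest mean cross-validated accuracy.
for name, (score, params) in sorted(results.items(), key=lambda kv: -kv[1][0]):
    print(f"{name}: mean CV accuracy {score:.3f} with {params}")
```

The same loop would extend to the other classifiers by adding entries to candidates. Scores from synthetic data say nothing about performance on the actual water quality dataset.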

Results

Of the models evaluated, the XGBoost classifier achieved the highest accuracy in predicting water potability.

Repository Contents

  • Code: Google Colab files (.ipynb) for data preprocessing, analysis, model training, and evaluation.
  • Datasets: Water quality dataset.
  • Report: Detailed report summarizing the project.