Stroke_Prediction

Overview

This repository contains code for building and evaluating several machine-learning models that predict the likelihood of a stroke from patient health data. The dataset includes demographic and medical attributes such as gender, age, hypertension status, and heart disease status.

Data Description

Kaggle Link: https://www.kaggle.com/datasets/fedesoriano/stroke-prediction-dataset

The dataset includes the following attributes:

Gender
Age
Hypertension
Heart disease
Ever married
Work type
Residence type
Average glucose level
BMI
Smoking status
Stroke (Target Variable)
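
As a rough illustration of how these attributes might be read and cleaned, the sketch below parses a few rows with Python's standard csv module. The column names are an assumption based on the Kaggle CSV header, and the rows are invented for the example. The "N/A" marker for a missing BMI and the median fill are also assumptions, not necessarily the preprocessing used in the notebooks.

```python
import csv
import io
import statistics

# Invented rows for illustration only; they are NOT taken from the dataset.
# The header names are an assumption based on the Kaggle CSV.
SAMPLE_CSV = """\
gender,age,hypertension,heart_disease,ever_married,work_type,Residence_type,avg_glucose_level,bmi,smoking_status,stroke
Male,70,0,1,Yes,Private,Urban,210.5,30.0,formerly smoked,1
Female,64,0,0,Yes,Self-employed,Rural,190.2,N/A,never smoked,1
Female,34,0,0,No,Private,Urban,85.3,24.0,never smoked,0
Male,52,1,0,Yes,Govt_job,Rural,99.8,N/A,smokes,0
"""


def parse_float(value):
    """Return a float, or None for a missing marker such as 'N/A' or ''."""
    try:
        return float(value)
    except ValueError:
        return None


def load_rows(text):
    """Parse CSV text into a list of dicts with numeric fields converted."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        record["age"] = float(record["age"])
        record["avg_glucose_level"] = float(record["avg_glucose_level"])
        record["bmi"] = parse_float(record["bmi"])
        for key in ("hypertension", "heart_disease", "stroke"):
            record[key] = int(record[key])
        rows.append(record)
    return rows


def impute_median_bmi(rows):
    """Return copies of the rows with missing BMI replaced by the median of known values."""
    known = [r["bmi"] for r in rows if r["bmi"] is not None]
    median = statistics.median(known)
    return [dict(r, bmi=median if r["bmi"] is None else r["bmi"]) for r in rows]


rows = impute_median_bmi(load_rows(SAMPLE_CSV))
positives = sum(r["stroke"] for r in rows)
print(f"{len(rows)} rows, {positives} with stroke = 1")
```

To use the real data, the same functions could be pointed at the text of the downloaded CSV file instead of SAMPLE_CSV.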

Installation and Usage

To set up the project environment:

Clone the repository: https://github.com/jayasurya247/Stroke_Prediction.git
Run the Python script or Jupyter Notebook.

Models Implemented

Random Forest Classifier
K-Nearest Neighbors (KNN)
Support Vector Machine (SVM)
Decision Tree Classifier
XGBoost Classifier

Evaluation

The models were evaluated on accuracy, precision, recall, and F1-score, using both the original dataset and a version resampled with SMOTE (Synthetic Minority Over-sampling Technique) to address class imbalance.
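
To make the idea behind SMOTE concrete, here is a minimal from-scratch sketch of its core step: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The function name smote_sketch and its parameters are hypothetical. The notebooks most likely rely on a library implementation instead, which is an assumption here.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric tuples (minority-class samples only).
    k: number of nearest minority neighbours to choose from.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding the base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical 2-D minority samples, for illustration only.
minority_points = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
new_points = smote_sketch(minority_points, n_new=5)
print(len(new_points), "synthetic samples created")
```

Because the interpolation is numeric, this sketch only applies to numeric features, so categorical attributes would need encoding or a categorical-aware variant. Oversampling is also usually applied to the training split only, so that the test set keeps the original class distribution.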

Results

Model performance is reported in the notebooks, which include detailed classification reports and accuracy comparisons.
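
When reading those reports, it helps to recall how the four metrics are derived from prediction counts, and why accuracy alone can mislead on an imbalanced target. The sketch below computes them from scratch. The label lists are hypothetical and are not results from this project.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical imbalanced labels: 95 negatives and 5 positives.
y_true = [0] * 95 + [1] * 5
# A degenerate model that always predicts "no stroke".
y_pred = [0] * 100

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this made-up case the always-negative model is right on 95 of 100 samples yet finds none of the positive cases, which is why recall and F1 on the stroke class are worth checking alongside accuracy.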

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributors

Jaya Surya Thota

Feel free to fork this project and contribute to improving the stroke prediction models.
