This repository contains the Disease Prediction Machine Learning Project, which predicts diseases from symptom data. By applying several machine learning algorithms, the project aims to aid early diagnosis and improve healthcare outcomes.
Data Preprocessing: We applied preprocessing techniques such as feature scaling and encoding to prepare the symptom data for model training.
Diverse Algorithm Implementation: The project harnesses a variety of machine learning algorithms, including Decision Trees, Support Vector Machines (SVM), Random Forests, and XGBoost, to predict diseases from symptoms with high accuracy.
Model Accuracy and Performance: Our models have been tuned for accuracy, with the XGBoost model reaching 89% accuracy, giving predictions that can support medical professionals.
Real-world Application: The insights and predictive power provided by this project have the potential to be incorporated into healthcare systems for enhanced diagnostic support.
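The preprocessing and model comparison described above can be illustrated with a minimal sketch. It is not the project's actual code: the disease profiles, symptom names, and sample counts are hypothetical toy data, the hyperparameters are arbitrary, and XGBoost is left out so that the example depends only on NumPy and Scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, MultiLabelBinarizer, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: each disease has three characteristic symptoms.
PROFILES = {
    "flu": ["fever", "cough", "fatigue"],
    "allergy": ["sneezing", "itchy_eyes", "runny_nose"],
    "migraine": ["headache", "nausea", "light_sensitivity"],
}

rng = np.random.default_rng(0)
symptom_lists, labels = [], []
for disease, symptoms in PROFILES.items():
    for _ in range(40):
        k = int(rng.integers(2, 4))  # each record reports 2 or 3 symptoms
        chosen = rng.choice(symptoms, size=k, replace=False).tolist()
        symptom_lists.append(chosen)
        labels.append(disease)

# Encoding: one binary column per symptom, integer codes for diseases.
X = MultiLabelBinarizer().fit_transform(symptom_lists)
y = LabelEncoder().fit_transform(labels)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Feature scaling is applied only in front of the SVM; tree models do not need it.
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Train each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Because the toy disease profiles do not overlap, these scores say nothing about performance on the real datasets in this repository; the sketch only shows the shape of the workflow.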
- Jupyter Notebook or PyCharm for coding and experimentation.
- Python with libraries such as Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, SciPy, Graphviz, XGBoost, and Plotly, plus itertools from the standard library.
- Clone the repository.
- Install the required Python libraries as listed above.
- Navigate to the `src` directory to find the source code and Jupyter notebooks.
- Explore the `data` directory to review the datasets used for model training and testing.
- Execute the Jupyter notebooks to see the step-by-step process of data preprocessing, model training, and evaluation.
- You are encouraged to contribute to the project, provide feedback, or use the project for educational purposes.
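
Before running the notebooks, it can help to confirm that the libraries named in this README are importable. The following is a small sketch using only the standard library; the import names in `REQUIRED` (for example `sklearn` for Scikit-learn) are assumptions based on the usual package names, and `missing_modules` is a hypothetical helper, not part of this repository.

```python
import importlib.util

# Assumed import names for the libraries listed in this README.
# itertools is omitted because it ships with Python.
REQUIRED = [
    "pandas", "numpy", "matplotlib", "seaborn", "sklearn",
    "scipy", "graphviz", "xgboost", "plotly",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All listed libraries are importable.")
```

Any name reported as missing can then be installed with your usual package manager before opening the notebooks.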