KNUST-BreastCancer-Prediction

This repository contains the resources and codebase for a research project aimed at predicting breast cancer cases using data from the KNUST hospital.

Primary Language: Jupyter Notebook

Breast Cancer Prediction at Oforikrom Municipality, Ghana


This repository contains the resources and codebase for a research project aimed at predicting breast cancer cases using data from the Oforikrom Municipality, Ghana.

Overview

Breast cancer is one of the most commonly diagnosed cancers worldwide, and early detection can significantly increase the chances of successful treatment. This project uses patient data from KNUST hospital to build a predictive model and illustrates the value of data-driven decision-making in healthcare.

Project Structure

  • Data Collection: Original dataset obtained from the KNUST hospital survey.
  • Data Cleaning & Wrangling: Scripts to preprocess, clean, and format the data for analysis.
  • Data Visualization: Visual representations of patterns, correlations, and insights.
  • Feature Selection: Implementing the Chi-Square approach to select influential features.
  • Model Building & Tuning: Scripts and notebooks detailing the Random Forest model creation, hyperparameter tuning, and bias-variance trade-off analysis.
  • Model Evaluation: Resources related to the model's performance evaluation using various metrics.
  • Recommendations & Insights: Derived learnings and areas of potential improvement.
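
The Chi-Square feature selection step listed above can be illustrated with a minimal sketch. It is not taken from the project's notebooks: the function name `chi_square_score`, the feature names, and the toy records are all hypothetical, and the sketch assumes categorical features and a binary diagnosis label. It computes the Pearson chi-square statistic between each feature and the label, then ranks features by that score. A library routine such as `chi2` in `sklearn.feature_selection` could be used instead; the README does not state which implementation the project relies on.

```python
from collections import Counter


def chi_square_score(feature, target):
    """Pearson chi-square statistic between a categorical feature and the target."""
    n = len(feature)
    observed = Counter(zip(feature, target))  # joint counts per (feature, target) cell
    feature_totals = Counter(feature)         # row totals
    target_totals = Counter(target)           # column totals
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for t_value, t_count in target_totals.items():
            # Expected count if feature and target were independent
            expected = f_count * t_count / n
            score += (observed[(f_value, t_value)] - expected) ** 2 / expected
    return score


# Hypothetical toy records, NOT the KNUST hospital data
records = {
    "screened_before": ["yes", "yes", "no", "no", "yes", "no", "yes", "no"],
    "region": ["a", "b", "a", "b", "a", "b", "a", "b"],
}
diagnosis = [1, 1, 0, 0, 1, 0, 1, 0]

scores = {name: chi_square_score(values, diagnosis) for name, values in records.items()}
# Higher score means stronger association with the label
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy data the first feature matches the label exactly, so it receives the higher score and is ranked first. In practice the top-ranked features would be kept and passed on to the model-building step.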

Key Findings

  • Achieved a model accuracy of 90.57%.
  • Precision, Recall, and F1-Score stood at 89.27%, 93.37%, and 91.27% respectively.
  • A set of key features, such as frequency of diagnostics and personal screening history, played pivotal roles in predictions.
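
As a quick consistency check on the figures above, the F1-Score is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision and recall; the helper name `f1_from` is illustrative and not part of the project's code.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


precision = 0.8927  # reported precision
recall = 0.9337     # reported recall

f1 = f1_from(precision, recall)
print(f"{f1:.2%}")  # prints 91.27%
```

The result agrees with the reported F1-Score of 91.27%.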

Usage

  1. Clone this repository.
    git clone https://github.com/DavidNart90/KNUST-BreastCancer-Prediction.git
    
  2. Navigate to the directory.
    cd KNUST-BreastCancer-Prediction
    
  3. Ensure you have the required libraries installed.
    pip install -r requirements.txt
    
  4. Run the respective scripts or Jupyter notebooks to replicate the results.

Contributions

This project is the result of collaborative research. Contributions are welcome. Please follow the existing code structure and formatting.

License

This project is licensed under the MIT License. Please refer to the LICENSE file for details.

Acknowledgments

A special thanks to KNUST hospital for providing the data and supporting this research initiative.