Heart Failure Prediction

  • A logistic regression model built with scikit-learn (sklearn)
  • Predicts whether a patient's heart failed during the follow-up period, based on the patient's clinical records

Description of the dataset:

Feature                         Explanation
Age                             Age of the patient
Anaemia                         Decrease of red blood cells or hemoglobin
High blood pressure             Whether the patient has hypertension
Creatinine phosphokinase (CPK)  Level of the CPK enzyme in the blood
Diabetes                        Whether the patient has diabetes
Ejection fraction               Percentage of blood leaving the heart at each contraction
Sex                             Woman or man
Platelets                       Platelets in the blood
Serum creatinine                Level of creatinine in the blood
Serum sodium                    Level of sodium in the blood
Smoking                         Whether the patient smokes
Time                            Follow-up period (days)
DEATH_EVENT                     Whether the patient died during the follow-up period

The dataset can be found on Kaggle.
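
As a starting point, the sketch below shows one way the CSV could be loaded and split into features and target. The function name `load_heart_data` is hypothetical, and the `DEATH_EVENT` column name is assumed to match the header of the downloaded file; adjust both to your copy of the data.

```python
import pandas as pd

# Assumed name of the target column; change it if your CSV header differs.
TARGET = "DEATH_EVENT"


def load_heart_data(csv_path):
    """Read the CSV and split it into a feature table X and a target Series y."""
    df = pd.read_csv(csv_path)
    if TARGET not in df.columns:
        raise ValueError(f"expected a '{TARGET}' column in {csv_path}")
    X = df.drop(columns=[TARGET])  # every remaining column is treated as a feature
    y = df[TARGET].astype(int)     # 1 = event occurred, 0 = no event
    return X, y
```

Calling `load_heart_data("path/to/your.csv")` returns the pair `(X, y)`; the path here is only a placeholder.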

Exploratory Data Analysis

  • Binary data pie plots:
[Figure: pie plots of the binary features]

- The hearts of about 31% of the patients failed during the follow-up period.
- The dataset does not appear to be balanced between male and female patients.
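
The percentages behind pie plots like these can be computed directly from the 0/1 columns. The following is a minimal sketch, assuming the binary features are encoded as 0 and 1; `binary_shares` is a hypothetical helper and the rows in `toy` are made up for illustration, not taken from the dataset.

```python
import pandas as pd


def binary_shares(df, columns):
    """Return, for each 0/1 column, the percentage of rows equal to 1."""
    return {col: round(float(df[col].mean()) * 100, 1) for col in columns}


# Made-up rows for illustration only; not taken from the real dataset.
toy = pd.DataFrame({
    "sex": [1, 1, 1, 0],
    "DEATH_EVENT": [0, 1, 0, 0],
})
print(binary_shares(toy, ["sex", "DEATH_EVENT"]))
# {'sex': 75.0, 'DEATH_EVENT': 25.0}
```

Running the same helper on the real table would give the share of each binary feature, including the class balance of the target.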

  • Distribution plots:
[Figure: distribution histograms of the features]
The distribution plots show where the values of each feature are concentrated.
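
To read the same information off numerically, one option is to bin a feature and report the fullest bin. This is only a sketch using `numpy.histogram`; the helper name `densest_bin` and the sample ages are made up for illustration.

```python
import numpy as np


def densest_bin(values, bins=10):
    """Return (left_edge, right_edge, count) of the fullest histogram bin."""
    counts, edges = np.histogram(values, bins=bins)
    i = int(np.argmax(counts))  # index of the bin holding the most values
    return float(edges[i]), float(edges[i + 1]), int(counts[i])


# Made-up ages for illustration only; not taken from the real dataset.
ages = [42, 50, 55, 58, 60, 60, 61, 63, 65, 70, 75, 90]
print(densest_bin(ages, bins=4))
# (54.0, 66.0, 7)
```

With four equal-width bins over 42 to 90, seven of the twelve sample values fall between 54 and 66, which is where a histogram of this toy list would peak.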

Conclusions of the Logistic Regression Model

  • Contributions of the features:
[Figure: contributions of the features to the logistic regression model]
Sex, smoking, platelets, high blood pressure, diabetes, creatinine phosphokinase and anaemia do not contribute much to the model compared with the other features.
  • Accuracy on train and test data:
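[Figure: accuracy of the logistic regression model on train and test data]
Judging by its accuracy on the train and test data, the model appears to predict reasonably well.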
  • Confusion Matrix:
[Figure: confusion matrix of the logistic regression model]
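
The three results above (feature contributions, train/test accuracy, confusion matrix) could be reproduced along the following lines. This is a sketch under stated assumptions, not necessarily how the figures in this README were produced: it assumes "contribution" means the absolute coefficient of each standardized feature, and the function name `fit_and_report`, the 25% test split, and the fixed seed are illustrative choices.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_and_report(X, y, test_size=0.25, seed=0):
    """Fit a scaled logistic regression and summarize its behavior."""
    # Stratify so train and test keep roughly the same share of positive cases.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )

    # Standardizing first makes coefficient magnitudes comparable across features.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)

    # Assumption: a feature's "contribution" is the absolute value of its coefficient.
    coefs = model.named_steps["logisticregression"].coef_[0]
    contributions = pd.Series(np.abs(coefs), index=X.columns)
    contributions = contributions.sort_values(ascending=False)

    test_pred = model.predict(X_test)
    return {
        "contributions": contributions,
        "train_accuracy": accuracy_score(y_train, model.predict(X_train)),
        "test_accuracy": accuracy_score(y_test, test_pred),
        # Rows are true classes, columns are predictions: [[TN, FP], [FN, TP]].
        "confusion_matrix": confusion_matrix(y_test, test_pred, labels=[0, 1]),
    }
```

Here `X` is expected to be a pandas DataFrame of the clinical features and `y` the DEATH_EVENT column. Comparing `train_accuracy` with `test_accuracy` gives a rough check for overfitting, and the off-diagonal cells of the confusion matrix show which kind of error the model makes more often.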