Heart-Disease - Search for risk factors

medical device

Table of content

  1. Introduction
  2. Objectives
  3. Approach
  4. Data
  5. Data preparation
  6. Data Modelling
  7. Evaluation
  8. References and links

Introduction

Data science is at the forefront in medicine. The classical area of heart disease diagnosis is a critical process step. The analysed dataset contains a range of symptoms and as the a result variable if the condition is malignent or benign.

Objectives

The target is to analyse the data and generate a model which predicts based on the input variables if the condition of the patient is malignent or benign. Items to investigate:

  • Age distibution of the patients
  • Chest pain vs. diagnosis - gender comparison
  • Precision of prediction algorithm

Approach

  • Data import
  • Data cleaning and preparation
  • Modelling
  • Evaluation

Data

The Cleveland database was provided by in Kaggle.

Creators:

  • Hungarian Institute of Cardiology. Budapest: Andras Janosi, M.D.
  • University Hospital, Zurich, Switzerland: William Steinbrunn, M.D.
  • University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D.
  • V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D. Donor:
  • David W. Aha (aha '@' ics.uci.edu) (714) 856-8779

PARAMETERS

  • age
  • sex
  • chest pain type (4 values)
  • resting blood pressure
  • serum cholestoral in mg/dl
  • fasting blood sugar > 120 mg/dl
  • resting electrocardiographic results (values 0,1,2)
  • maximum heart rate achieved
  • exercise induced angina
  • oldpeak = ST depression induced by exercise relative to rest
  • the slope of the peak exercise ST segment
  • number of major vessels (0-3) colored by flourosopy
  • thal: 3 = normal; 6 = fixed defect; 7 = reversable defect

Data preparation

All details are contained in the notebook ('heard_disease_research.ipynb').

Evaluation

Age distibution of the patients

age distribution

The mean of the patients' age is end 50 with a peak at round about 60.

Chest pain vs. diagnosis - gender comparison

cp_diagnosis_gender

Males (flag 1) show an increase of chest pain (cp) while at the same time the likelyhood of a serious condition drops. For woman (flag 0) is the trend reversed.

Precision of prediction algorithm

conf_matric

The confustion matric of the generated linear regression model shows a prediction presision of 82%.

References and links

Dataset in Kaggle