Heart-Arrhythmia-Classification

This program takes an ECG recording in European Data Format (EDF) as input and classifies each heartbeat as either normal or one of several types of arrhythmia. Classification is performed by a deep learning model.


Instructions to run

  1. Note the location of your ".edf" file.

  2. Run the following command from the command line:

    python predict.py PATH_TO_EDF_FILE SAVE_FILE_NAME MODEL_NUMBER

    where

    • PATH_TO_EDF_FILE - Location of your EDF file
    • SAVE_FILE_NAME - Base name of the text and npy files that will be generated
    • MODEL_NUMBER - Model used for classification
      • 0 - Model from the paper
      • 1 - Simplified CNN model

For example, the command below generates a file called "ECG_100.txt" that contains the classifications:

    python predict.py ./files/100.edf ECG_100 


Dataset

The original dataset used is the MIT-BIH Arrhythmia Dataset. It is preprocessed following the methodology described in the paper below, so that each sample contains a single heartbeat with normalized amplitude.

Kachuee, M., Fazeli, S., & Sarrafzadeh, M. (2018). ECG Heartbeat Classification: A Deep Transferable Representation. 2018 IEEE International Conference on Healthcare Informatics (ICHI). https://doi.org/10.1109/ichi.2018.00092 (https://arxiv.org/pdf/1805.00794.pdf)


The process followed is:

  1. Splitting the continuous ECG signal into 10s windows and selecting one 10s window.
  2. Normalizing the amplitude values to the range between zero and one.
  3. Finding the set of all local maxima based on zero-crossings of the first derivative.
  4. Finding the set of ECG R-peak candidates by applying a threshold of 0.9 on the normalized value of the local maxima.
  5. Finding the median of R-R time intervals as the nominal heartbeat period of that window (T).
  6. For each R-peak, selecting a signal part with the length equal to 1.2T.
  7. Padding each selected part with zeros to make its length equal to a predefined fixed length.
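
The steps above can be sketched in NumPy as follows. This is a minimal illustration, not the code used by predict.py: the function name `extract_beats`, its parameters, and the default output length of 187 samples are assumptions made for the example, and segments longer than the fixed length are simply truncated.

```python
import numpy as np

def extract_beats(signal, fs=125, window_s=10, threshold=0.9, out_len=187):
    """Hypothetical sketch of steps 1-7 for the first window of a signal."""
    # Step 1: select one 10 s window (here, the first one).
    window = np.asarray(signal[: fs * window_s], dtype=float)

    # Step 2: min-max normalize the amplitudes to [0, 1].
    span = window.max() - window.min()
    if span == 0:
        return np.empty((0, out_len))  # flat signal, no beats to extract
    window = (window - window.min()) / span

    # Step 3: local maxima, where the first derivative goes from
    # positive to non-positive.
    d = np.diff(window)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1

    # Step 4: R-peak candidates are local maxima at or above the threshold.
    peaks = maxima[window[maxima] >= threshold]
    if len(peaks) < 2:
        return np.empty((0, out_len))  # need two peaks for an R-R interval

    # Step 5: nominal heartbeat period T is the median R-R interval.
    T = np.median(np.diff(peaks))

    # Steps 6-7: take 1.2 * T samples from each R-peak and zero-pad
    # (or truncate, an assumption of this sketch) to out_len samples.
    seg_len = min(int(round(1.2 * T)), out_len)
    beats = np.zeros((len(peaks), out_len))
    for i, p in enumerate(peaks):
        seg = window[p : p + seg_len]
        beats[i, : len(seg)] = seg
    return beats
```

Each row of the returned array holds one heartbeat starting at its R-peak, which is the shape a 1D CNN would typically consume after adding a channel axis.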

MIT-BIH Arrhythmia dataset:

  • Number of Categories: 5
  • Number of Samples: 109446
  • Sampling Frequency: 125 Hz
  • Data Source: PhysioNet's MIT-BIH Arrhythmia Dataset
  • Classes: ['N': 0, 'S': 1, 'V': 2, 'F': 3, 'Q': 4]
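
As an illustration of how the class indices above map back to labels, the sketch below decodes a batch of model outputs. It assumes the model returns one score per class for each heartbeat; the names `CLASS_INDEX`, `INDEX_TO_CLASS`, and `decode_predictions` are hypothetical and not taken from the repository.

```python
import numpy as np

# Class mapping as listed for the dataset.
CLASS_INDEX = {"N": 0, "S": 1, "V": 2, "F": 3, "Q": 4}
INDEX_TO_CLASS = {index: label for label, index in CLASS_INDEX.items()}

def decode_predictions(scores):
    """Return the label with the highest score for each heartbeat.

    `scores` is assumed to have shape (n_beats, 5).
    """
    indices = np.argmax(np.asarray(scores), axis=1)
    return [INDEX_TO_CLASS[int(i)] for i in indices]
```

For instance, `decode_predictions([[0.9, 0.025, 0.025, 0.025, 0.025], [0, 0, 1, 0, 0]])` yields one label per row.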

Class distribution in the dataset

  • Before Resampling


  • After Resampling




Model and Results


A. Paper based model


Figure 1: Model Structure



Figure 2: Accuracy and Loss Plot




Figure 3: Confusion Matrix




Figure 4: Classification Report




B. Simplified model




Figure 5: Simplified Model Structure



Figure 6: Accuracy and Loss Plot




Figure 7: Confusion Matrix




Figure 8: Classification Report