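This is the implementation of the neural networks for electrocardiogram classification proposed in the paper "Convolutional Recurrent Neural Networks for Electrocardiogram Classification" (Zihlmann et al., 2017).
The code was tested with TensorFlow 1.0.0 and Python 3.5. To install the required Python packages, run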
pip install -r requirements.txt
For GPU support, use
pip install -r requirements_gpu.txt
Other versions of the packages might work, but were not tested.
The dataset we used was provided for the 2017 PhysioNet/CinC Challenge. To download the data and set up the expected folder structure, switch to the data folder and run the script
./get_data.sh
The framework is designed for experimenting with different network architectures. Some code structures might seem complicated, but they keep the training jobs clearly organized and produce a detailed log of all results.
The generic architectures of the CNN and the CRNN are defined in codes/network. The hyperparameters of the model and the training procedure are defined in a JSON file in the folder models. The setups from the paper are predefined, and the parameter names should be self-explanatory.
We always used the same split of the dataset to ensure a fair comparison of different architectures. Other splits of the dataset can be generated with the Python script generateSplit.py. The first two parameters determine the sizes of the test, validation, and training sets. The last parameter is the seed used to generate the random split.
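The idea of a seeded, reproducible split can be sketched as follows. This is a minimal illustration and not the code of generateSplit.py: the function name make_split, the interpretation of the two size parameters as test and validation fractions, and the record names are all assumptions made for the example.

```python
import random


def make_split(record_ids, test_frac, val_frac, seed):
    """Shuffle record IDs with a fixed seed and cut them into three disjoint sets.

    Hypothetical helper: the two fractions are assumed to give the test and
    validation sizes, and the training set receives whatever remains.
    """
    if test_frac < 0 or val_frac < 0 or test_frac + val_frac > 1:
        raise ValueError("fractions must be non-negative and sum to at most 1")

    ids = sorted(record_ids)      # fixed starting order, so only the seed matters
    rng = random.Random(seed)     # private generator, global random state untouched
    rng.shuffle(ids)

    n_test = int(round(test_frac * len(ids)))
    n_val = int(round(val_frac * len(ids)))

    test = ids[:n_test]
    val = ids[n_test:n_test + n_val]
    train = ids[n_test + n_val:]
    return test, val, train


if __name__ == "__main__":
    records = ["rec%04d" % i for i in range(100)]  # made-up record names
    test, val, train = make_split(records, 0.2, 0.1, seed=0)
    print(len(test), len(val), len(train))
```

Calling the function twice with the same seed returns the same three lists, which is what makes a comparison across architectures fair.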
Before starting training, a job has to be defined in the folder jobs.
Before starting, you have to adapt the file codes/definitions.py to your environment. The important parameters to set are default_dev and GPU_devices (see the file for more detail).
The provided file jobs/your_machine.json allows you to reproduce the 5-fold CV experiments from the paper. All the folds 0,1,2,3,4 have to be activated in cvids. The proportions of the train/validation/test split are also chosen here (split). There is an option to turn off the TensorBoard log (log_en), since it uses a lot of disk space (a simple .csv file with the learning curve is always available).
To start these training jobs, change to the codes folder and run
python train.py
This stores all the jobs (one per model per CV-fold) in a queue, and they get processed by the available GPUs.
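The queueing scheme described above can be sketched as follows. This is an illustration of the general pattern only and not the code of train.py: the function run_jobs, the callback train_fn, and the model and device names are hypothetical, and plain threads stand in for whatever mechanism the repository uses.

```python
import queue
import threading


def run_jobs(models, folds, devices, train_fn):
    """Queue one job per (model, fold) pair and let one worker per device drain it.

    Hypothetical sketch: train_fn(model, fold, device) stands in for the real
    training routine and is supplied by the caller.
    """
    jobs = queue.Queue()
    for model in models:
        for fold in folds:
            jobs.put((model, fold))

    results = []
    lock = threading.Lock()

    def worker(device):
        while True:
            try:
                model, fold = jobs.get_nowait()
            except queue.Empty:
                return  # queue is drained, this device is done
            outcome = train_fn(model, fold, device)
            with lock:  # protect the shared result list
                results.append((model, fold, device, outcome))

    threads = [threading.Thread(target=worker, args=(d,)) for d in devices]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Made-up model and device names, with a dummy training function.
    done = run_jobs(["cnn_a", "crnn_b"], range(5), ["gpu0", "gpu1"],
                    lambda model, fold, device: "trained")
    print(len(done), "jobs finished")
```

Each device picks up the next pending job as soon as it finishes the previous one, so a slow fold does not block the remaining jobs.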
If you change the code and want to debug it using the model models/model_name, you can simply run
python train.py model_name
This starts a single training-job with some fixed parameters.
After the first job is completed, the results can be found in the log folder. The folders are organized in the format architecture/jobname/fold. Each such folder contains the following files:
- The TensorBoard log, if it was activated for this job.
- The trained model that achieved the best validation score.
- hotlog.csv, the simple CSV log with the learning curve.
- A copy of the model and the job that led to those results.
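Given the architecture/jobname/fold layout described above, the learning-curve files of a run can be gathered with a few lines of Python. This is a minimal sketch and not part of the repository: the helper name collect_hotlogs is hypothetical, and fold folder names are treated as opaque strings because their exact naming is not specified here.

```python
from collections import defaultdict
from pathlib import Path


def collect_hotlogs(log_root):
    """Group hotlog.csv paths by (architecture, jobname), keyed by fold folder name.

    Hypothetical helper that relies only on the architecture/jobname/fold
    layout and the hotlog.csv file name.
    """
    log_root = Path(log_root)
    runs = defaultdict(dict)
    for path in sorted(log_root.glob("*/*/*/hotlog.csv")):
        architecture, jobname, fold = path.relative_to(log_root).parts[:3]
        runs[(architecture, jobname)][fold] = path
    return dict(runs)


if __name__ == "__main__":
    # Assumes the script is started next to a folder named "log".
    for (architecture, jobname), folds in collect_hotlogs("log").items():
        print(architecture, jobname, sorted(folds))
```

The returned paths can then be read with the standard csv module to compare learning curves across folds.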
If you want to compare how well different CNN configurations work, you can run the following from the codes folder:
python summarizeScores.py CNN
The same works for the CRNN.
If you find this code useful for your research, please cite
@incollection{zihlmann2017convolutional,
Author = {Zihlmann, Martin and Perekrestenko, Dmytro and Tschannen, Michael},
Booktitle = {Computing in Cardiology (CinC)},
Title = {Convolutional Recurrent Neural Networks for Electrocardiogram Classification},
Year = {2017}}
and acknowledge this repository.