ECGXtractor is a Python library for investigating ECG biometric recognition. In particular, it runs the experiments described in the article cited at the end of this page.
Melzi, Pietro, Ruben Tolosana, and Ruben Vera-Rodriguez. "ECG Biometric Recognition: Review, System Proposal, and Benchmark Evaluation." IEEE Access, 2023.
The file requirements.txt lists the dependencies required by this library. Run the following commands on Windows to set up a suitable virtual environment (replace <env> with the name of the environment). The pip and virtualenv packages are required to run these commands.
python -m venv <env>
.\<env>\Scripts\activate
python -m pip install -r requirements.txt
In this folder you can find the lists of genuine and impostor comparison pairs evaluated in the different runs of the verification experiments included in our paper.
We also provide here the weights of our pretrained models (saved.rar), used to perform the experiments described in our paper.
The following instructions refer to the PTB database. You can run experiments with the ECG-ID and CYBHI databases in the same way.
Download ptb.rar here and extract it into datasets\ptb. It contains:
- 549 12-Lead ECG signals from the PTB database, re-sampled at 500 Hz and filtered;
- the list of healthy subjects included in the PTB database;
- the list of subjects for which multiple ECG signals are contained in PTB;
- for each ECG signal, the list of time instants corresponding to the r-peaks.
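To picture how the provided r-peak instants relate to the 500 Hz signals, here is a minimal sketch of cutting fixed windows around each r-peak. The function name and the window bounds are illustrative assumptions, not the exact values or code used by the library.

```python
import numpy as np

def segment_heartbeats(signal, r_peaks, fs=500, before=0.25, after=0.45):
    """Cut a fixed window around each r-peak.

    signal: array of shape (n_samples, n_leads); r_peaks: sample indices.
    The window bounds (in seconds) are assumptions for illustration only.
    """
    pre, post = int(before * fs), int(after * fs)
    segments = []
    for r in r_peaks:
        # Keep only peaks whose full window fits inside the signal.
        if r - pre >= 0 and r + post <= len(signal):
            segments.append(signal[r - pre:r + post])
    return np.stack(segments)

# Example: 10 s of synthetic 12-lead data with an r-peak every second.
sig = np.random.randn(5000, 12)
peaks = np.arange(250, 5000, 500)
beats = segment_heartbeats(sig, peaks)
print(beats.shape)  # (10, 350, 12)
```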
Run the file build_segments_ptb.py after changing the parameter consecutive_heartbeats in the code to:
- -1 to generate a template from each signal;
- 1 to extract all the single heartbeats obtainable from each signal;
- n > 1 to generate summary samples from each signal, where n is the number of consecutive single heartbeats considered for each summary sample. To reproduce the experiments of the article, set n = 10.
python src\ptb\build_segments_ptb.py
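Conceptually, a summary sample with n > 1 can be thought of as combining n consecutive heartbeat segments, for instance by averaging. This is a hedged sketch of that idea under an averaging assumption; see build_segments_ptb.py for the actual behavior.

```python
import numpy as np

def summary_samples(beats, n=10):
    """Combine groups of n consecutive heartbeats into summary samples.

    beats: array of shape (n_beats, length, leads).
    Averaging is an assumed illustration of consecutive_heartbeats = n.
    """
    groups = len(beats) // n
    # Drop trailing beats that do not fill a complete group, then average.
    return beats[:groups * n].reshape(groups, n, *beats.shape[1:]).mean(axis=1)

beats = np.random.randn(25, 350, 12)
print(summary_samples(beats, n=10).shape)  # (2, 350, 12)
```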
In the datasets folder, we provide the dataset files used in the different experiments involving PTB:
- single-session verification, with the set of 52 healthy subjects: test_healthy_single_verification;
- single-session verification, with the set of 113 subjects: test_multi_single_verification;
- multi-session verification, with the set of 113 subjects provided with multiple ECG signals: test_multi_multi_verification;
- single-session identification, with the set of 52 healthy subjects: train_healthy_single_identification, val_healthy_single_identification, test_healthy_single_identification;
- single-session identification, with the set of 113 subjects: train_multi_single_identification, val_multi_single_identification, test_multi_single_identification;
- multi-session identification, with the set of 113 subjects provided with multiple ECG signals: train_multi_multi_identification, val_multi_multi_identification, test_multi_multi_identification.
These files are obtained by running create_dataset_ptb.py.
Important: the dataset files contain the list of files considered in each experiment. The specific samples used to train and evaluate the system are generated during execution by functions contained in the source code.
In the settings folder, we provide the files containing the experimental settings used when training the different models. As the Autoencoder and the verification networks are trained with the in-house database, which is not available, some fields of the JSON files are left empty (i.e., data_path, train, val).
In config_verification you can change the following parameters before running experiments for the verification task:
- lead_i: true if only Lead I is considered, false if all 12 Leads are considered.
- positive_samples: the number of genuine comparisons generated from each subject.
- negative_multiplier: the product positive_samples * negative_multiplier is the number of impostor comparisons generated for each subject, i.e., comparisons whose enrolment sample belongs to that subject.
- data_path and test: joining them yields the path of the dataset file used for evaluation.
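To make positive_samples and negative_multiplier concrete, here is a hedged sketch of how comparison pairs per subject might be assembled. The function and variable names are assumptions for illustration, not the library's actual code.

```python
import random

def build_pairs(samples_by_subject, positive_samples=5, negative_multiplier=2, seed=0):
    """Return (genuine, impostor) comparison pair lists.

    For each subject: positive_samples genuine pairs, plus
    positive_samples * negative_multiplier impostor pairs whose
    enrolment sample belongs to that subject.
    """
    rng = random.Random(seed)
    genuine, impostor = [], []
    subjects = list(samples_by_subject)
    for subj in subjects:
        own = samples_by_subject[subj]
        others = [s for o in subjects if o != subj for s in samples_by_subject[o]]
        for _ in range(positive_samples):
            genuine.append(tuple(rng.sample(own, 2)))
        for _ in range(positive_samples * negative_multiplier):
            impostor.append((rng.choice(own), rng.choice(others)))
    return genuine, impostor

# 3 hypothetical subjects with 4 recordings each.
data = {f"s{i}": [f"s{i}_rec{j}" for j in range(4)] for i in range(3)}
gen, imp = build_pairs(data)
print(len(gen), len(imp))  # 15 30
```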
For the identification task, config_identification allows you to change the following parameters, in addition to those described above:
- initial_weights: the weights of the bottom layers of the identification model (not trainable).
- individuals: the output dimension of the classifier.
- resample_max: true to oversample, false to undersample. It is used when the number of samples per subject is not equal within the training and validation datasets.
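The effect of resample_max can be pictured with a small sketch: oversampling raises every subject to the largest per-subject count, undersampling lowers every subject to the smallest. This is an assumed illustration; the actual logic lives in the source code.

```python
import random
from collections import defaultdict

def resample(labels, resample_max=True, seed=0):
    """Return a balanced list of sample indices.

    resample_max=True oversamples each subject up to the largest count;
    resample_max=False undersamples each subject down to the smallest count.
    """
    rng = random.Random(seed)
    by_subj = defaultdict(list)
    for idx, subj in enumerate(labels):
        by_subj[subj].append(idx)
    counts = [len(v) for v in by_subj.values()]
    target = max(counts) if resample_max else min(counts)
    balanced = []
    for idxs in by_subj.values():
        if resample_max:
            # Keep all samples, then draw extras with replacement.
            balanced.extend(idxs + rng.choices(idxs, k=target - len(idxs)))
        else:
            # Draw a subset without replacement.
            balanced.extend(rng.sample(idxs, target))
    return balanced

labels = ["a"] * 3 + ["b"] * 5
print(len(resample(labels, True)), len(resample(labels, False)))  # 10 6
```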
Important: some parameters, e.g., lead_i and initial_weights, must be consistent with the selected dataset and network architecture; otherwise, an error message will appear during execution.
To evaluate previously trained models for ECG biometric verification (or models trained for identification), run one of the following commands:
python src\predict.py settings\config_verification.json --model_path <path of the saved model>
or
python src\predict.py settings\config_identification.json --model_path <path of the saved model>
Additionally, if you want to train a model with your own database and settings, you can run the following command:
python src\train.py settings\config_identification.json
After each training experiment, a folder named saved will be created, containing the weights of your trained models.
Melzi, Pietro, Ruben Tolosana, and Ruben Vera-Rodriguez. "ECG Biometric Recognition: Review, System Proposal, and Benchmark Evaluation." IEEE Access, 2023.
Please remember to reference the article in any work made public, in any form, that is based directly or indirectly on any part of ECGXtractor.
For further questions, please send an email to pietro.melzi@uam.es