HST: Hierarchical Spectrogram Transformers

This repository contains the official implementation of Hierarchical Spectrogram Transformers (HST) described in the following paper:

Aytekin, I., Dalmaz, O., Gonc, K., Ankishan, H., Saritas, E.U., Bagci, U., Celik, H., & Çukur, T. (2022). COVID-19 Detection from Respiratory Sounds with Hierarchical Spectrogram Transformers. ArXiv, abs/2207.09529.

Dependencies

python>=3.6.9
torch>=1.7.0
torchvision>=0.8.1
librosa
cuda>=11.3

Download pre-trained HST models

The following links contain HST model weights pre-trained on ImageNet:

After downloading the weights, place them at HST/model/imagenet_weights/hst_base_imagenet.pth.

Dataset

This work uses the dataset from the paper Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data. The dataset is not publicly available, but it can be released for research purposes, as stated here.

For Task 1,

  • covid: covidandroidnocough + covidandroidwithcough + covidwebnocough + covidwebwithcough
  • healthy: healthyandroidnosymp + healthywebnosymp

For Task 2,

  • covid: covidandroidwithcough + covidwebwithcough
  • healthy: healthyandroidwithcough + healthywebwithcough
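The two task definitions above can be written down as a lookup table, which is handy when sorting the source folders into classes. This is only a minimal sketch: the folder names are taken from the lists above, while `TASK_FOLDERS` and `label_for` are hypothetical names used for illustration and are not part of this repository.

```python
# Source folders that make up each class, copied from the task lists above.
# TASK_FOLDERS and label_for are illustrative names, not part of the HST code.
TASK_FOLDERS = {
    "task1": {
        "covid": [
            "covidandroidnocough",
            "covidandroidwithcough",
            "covidwebnocough",
            "covidwebwithcough",
        ],
        "healthy": ["healthyandroidnosymp", "healthywebnosymp"],
    },
    "task2": {
        "covid": ["covidandroidwithcough", "covidwebwithcough"],
        "healthy": ["healthyandroidwithcough", "healthywebwithcough"],
    },
}


def label_for(task, folder):
    """Return "covid" or "healthy" for a source folder, or None if the task does not use it."""
    for label, folders in TASK_FOLDERS[task].items():
        if folder in folders:
            return label
    return None
```

For example, `label_for("task2", "covidwebnocough")` returns `None`, because Task 2 only uses the folders with cough symptoms.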

The audio files in the folders listed above are converted to spectrograms with wave2spectrogram.py. The resulting dataset should then be organized as follows:

/data/
  ├── task1_cough
  ├── task1_breath
  ├── task2_cough
  ├── task2_breath  
 
/data/task1_cough/
  ├── train_test
  ├── val  
  
/data/task1_cough/train_test
  ├── covid
  ├── healthy
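Before training, it can help to check that the folder tree matches the layout above. The sketch below assumes that every task/modality folder follows the same structure shown for task1_cough (a train_test folder with covid and healthy subfolders, plus a val folder); the README only spells this out for task1_cough, so treat that as an assumption. The helper names `expected_dirs` and `missing_dirs` are hypothetical and not part of this repository.

```python
from pathlib import Path

TASKS = ("task1", "task2")
MODALITIES = ("cough", "breath")
CLASSES = ("covid", "healthy")


def expected_dirs(root):
    """List the directories implied by the layout above.

    Assumption: all four task/modality folders mirror task1_cough.
    """
    root = Path(root)
    dirs = []
    for task in TASKS:
        for modality in MODALITIES:
            base = root / f"{task}_{modality}"
            dirs.append(base / "val")
            for cls in CLASSES:
                dirs.append(base / "train_test" / cls)
    return dirs


def missing_dirs(root):
    """Return the expected directories that do not exist under root."""
    return [d for d in expected_dirs(root) if not d.is_dir()]
```

Calling `missing_dirs("/data")` returns an empty list when the tree is complete, and otherwise lists the folders that still need to be created.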

Train and test

To train and test the chosen model with a given seed, run:

cd HST
python3 train.py --dataset "/data/task1_cough/train_test"  --model "hst_base"  --pretrained True  --seed 1

In our paper, HST is trained with 10 different seeds, in the manner of 10-fold cross-validation, and the reported results are averaged over these runs.
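The multi-seed protocol can be scripted along the following lines. This is a rough sketch rather than the procedure used for the paper: it only uses the train.py flags shown in the command above, it assumes the seeds are 1 through 10 (the README does not list them), and it leaves collecting each run's score to the reader, since how train.py reports its metrics is not described here. `build_train_command` and `summarize` are hypothetical helper names.

```python
import shlex
import statistics

# Assumption: the README does not state which ten seeds were used.
SEEDS = range(1, 11)


def build_train_command(seed, dataset="/data/task1_cough/train_test", model="hst_base"):
    """Build the train.py invocation shown above for a single seed."""
    return [
        "python3", "train.py",
        "--dataset", dataset,
        "--model", model,
        "--pretrained", "True",
        "--seed", str(seed),
    ]


def summarize(scores):
    """Return the mean and sample standard deviation of per-seed scores."""
    return statistics.mean(scores), statistics.stdev(scores)


commands = [build_train_command(seed) for seed in SEEDS]
for cmd in commands:
    # Print each command; pass cmd to subprocess.run(cmd, check=True) to execute it.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Once a score has been recorded for each seed, `summarize(scores)` gives the average to report together with its spread across seeds.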

Demo

A single respiratory sound recording can be tested with demo.py. The HST-Base model trained on the Task 2 cough data with seed 1 can be downloaded from this link.

python3 demo.py --audio_path "sample_resp_sound"

The result is printed as either "healthy" or "covid".

Citation

You are encouraged to modify/distribute this code. However, please acknowledge this code and cite the paper appropriately.

@misc{hst,
  doi = {10.48550/ARXIV.2207.09529},
  url = {https://arxiv.org/abs/2207.09529},
  author = {Aytekin, Idil and Dalmaz, Onat and Gonc, Kaan and Ankishan, Haydar and Saritas, Emine U and Bagci, Ulas and Celik, Haydar and Cukur, Tolga},
  keywords = {Sound (cs.SD), Machine Learning (cs.LG), Audio and Speech Processing (eess.AS), FOS: Computer and information sciences, FOS: Electrical engineering, electronic engineering, information engineering},
  title = {COVID-19 Detection from Respiratory Sounds with Hierarchical Spectrogram Transformers},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}

Acknowledgements

This code uses libraries from covid19-sounds-kdd20.

For questions and comments, please contact me: aytekinayceidil@gmail.com