Fast yet effective speech emotion recognition with self-distillation

License: MIT

SelfDistill-SER

This repository contains the Python and PyTorch code for the self-distillation framework in our paper:

Zhao Ren, Thanh Tam Nguyen, Yi Chang, and Björn W. Schuller. Fast yet effective speech emotion recognition with self-distillation. ICASSP, 2023.

Citation

@inproceedings{ren2022fast,
      title={Fast yet effective speech emotion recognition with self-distillation}, 
      author={Zhao Ren and Thanh Tam Nguyen and Yi Chang and Björn W. Schuller},
      year={2023},
      booktitle={ICASSP},
      note={5 pages}
}

Abstract

In this paper, self-distillation is applied to produce a fast and effective speech emotion recognition (SER) model by simultaneously fine-tuning wav2vec 2.0 and training its shallower versions.
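
The idea of training shallower versions alongside the full model can be illustrated with a generic self-distillation loss. The sketch below is not taken from this repository or the paper. It assumes each shallower version produces its own class logits, that the deepest output acts as the teacher, and that each shallower output is trained with a mix of cross-entropy on the label and a temperature-scaled KL term towards the teacher. The names `self_distillation_loss`, `exit_logits`, `alpha`, and `temperature` are hypothetical. Plain Python is used to keep the arithmetic visible; in PyTorch the same terms would typically be computed with `torch.nn.functional.cross_entropy` and `torch.nn.functional.kl_div`.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability of the true class."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def self_distillation_loss(exit_logits, label, alpha=0.5, temperature=2.0):
    """Generic self-distillation loss (an assumption, not the paper's exact loss).

    exit_logits: list of logit vectors ordered from shallowest to deepest;
                 the last entry is treated as the teacher.
    alpha:       weight of the distillation term for each shallower output.
    """
    teacher_logits = exit_logits[-1]
    teacher_probs = softmax(teacher_logits, temperature)

    # The deepest model is trained on the label only.
    loss = cross_entropy(teacher_logits, label)

    # Each shallower output learns from the label and from the teacher.
    for student_logits in exit_logits[:-1]:
        hard = cross_entropy(student_logits, label)
        student_probs = softmax(student_logits, temperature)
        soft = kl_divergence(teacher_probs, student_probs)
        loss += (1 - alpha) * hard + alpha * temperature ** 2 * soft
    return loss


if __name__ == "__main__":
    # Two shallower outputs and one deep output for a 4-class emotion task.
    logits = [[0.2, 0.1, 0.0, 0.3], [0.1, 1.0, 0.0, 0.2], [0.0, 2.5, 0.1, 0.2]]
    print(self_distillation_loss(logits, label=1))
```

At inference time, a design like this lets one of the shallower outputs be used on its own, trading some accuracy for speed.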

Config

All of the paths can be set in the runme.sh file.

Running Experiments

Preprocessing: main/preprocess.py

Model training: main/main_pytorch.py

Both Python files can be run via:

sh runme.sh