disjoint-mtl

Research code for the paper "Towards multi-task learning of speech and speaker recognition", available at https://arxiv.org/pdf/2302.12773.pdf

Primary language: Python. License: MIT.

Disjoint MTL

Code accompanying the paper "Towards multi-task learning for speech and speaker recognition".

See paper_experiments.md for commands to reproduce results.

See here for some model checkpoints.

See here for VoxCeleb1 and VoxCeleb2 ASR labels generated with Whisper.

Quick start guide

Copy .env.example to .env and fill in the values accordingly.
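
Before running anything, it can help to check that no value in .env was left empty. The snippet below is a minimal sketch of such a check and is not part of this repository: the helper names parse_env and unfilled_keys are hypothetical, and it assumes both files use plain KEY=VALUE lines with optional # comments.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def unfilled_keys(example_text: str, env_text: str) -> list[str]:
    """Return keys from the example that are absent or empty in the .env text."""
    env = parse_env(env_text)
    return [key for key in parse_env(example_text) if not env.get(key)]


if __name__ == "__main__":
    # Assumption: both files live in the current working directory.
    example, env = Path(".env.example"), Path(".env")
    if example.exists() and env.exists():
        print(unfilled_keys(example.read_text(), env.read_text()))
```

Run from the repository root, this prints the keys that still need a value; an empty list suggests every key from .env.example has been filled in.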

See data_utility for instructions on preparing the data. See sre2008, hub5_2000, voxceleb and librispeech for the individual datasets.

Install dependencies with poetry update.

Run experiments with run_mtl_disjoint.py, run_mtl_joint.py, run_speaker.py and run_speech.py.

Cite

You can cite this work as:

@INPROCEEDINGS{vaessen2023mtl,
  author={Vaessen, Nik and van Leeuwen, David A.},
  booktitle={Interspeech 2023},
  title={Towards multi-task learning for speech and speaker recognition},
  year={2023},
}