FastAudio is a learnable audio front-end that team Magnum designed for the ASVspoof 2021 challenge. It was developed using the SpeechBrain framework. The solution was produced by Quchen Fu and Zhongwei Teng, researchers in the Magnum Research Group at Vanderbilt University. The Magnum Research Group is part of the Institute for Software Integrated Systems.
The ASVspoof 2021 competition challenges teams to develop countermeasures capable of discriminating between bona fide and spoofed or deepfake speech. The model achieved a min t-DCF score of 0.2531 on the LA track of the open leaderboard.
- speechbrain==0.5.7
- pandas
- wandb
- torch==1.8.0+cu111
- torchaudio==0.8.0
- nnAudio==0.2.6
- Create a virtual environment with python3.8 installed (`virtualenv`).
- `git clone --recursive https://github.com/QuchenFu/Fastaudio`
- Use `pip install -r requirements.txt` to install the requirements.
- `cd leaf-audio-pytorch/` and `pip install -e .`
- `pip install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html`
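After installation, it can help to confirm that the pinned versions are the ones actually present in the virtual environment. The following is a minimal sketch, not part of the repository: the helper names `find_mismatches` and `installed_versions` are hypothetical, and the version strings reported by a given install may differ slightly from the pins (for example in the `+cu111` suffix).

```python
from importlib import metadata

# Pins copied from the requirements list above; unpinned packages are omitted.
PINNED = {
    "speechbrain": "0.5.7",
    "torch": "1.8.0+cu111",
    "torchaudio": "0.8.0",
    "nnAudio": "0.2.6",
}


def find_mismatches(pinned, installed):
    """Return {name: (expected, found)} for every package whose version differs."""
    return {
        name: (expected, installed.get(name))
        for name, expected in pinned.items()
        if installed.get(name) != expected
    }


def installed_versions(names):
    """Look up installed versions; packages that are not installed map to None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    problems = find_mismatches(PINNED, installed_versions(PINNED))
    for name, (expected, found) in problems.items():
        print(f"{name}: expected {expected}, found {found}")
    if not problems:
        print("All pinned packages match.")
```

Running the script inside the virtual environment prints one line per mismatched or missing package.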
.
├── data
│   ├── PA
│   │   └── ...
│   └── LA
│       ├── ASVspoof2019_LA_asv_protocols
│       ├── ASVspoof2019_LA_asv_scores
│       ├── ASVspoof2019_LA_cm_protocols
│       ├── ASVspoof2019_LA_train
│       ├── ASVspoof2019_LA_dev
│       └── ASVspoof2021_LA_eval
│
└── Fastaudio
- Download the data here
- Unzip and save the data to a folder `data` in the same directory as `Fastaudio`
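Before preprocessing, the directory layout shown in the tree can be checked programmatically. This is a minimal sketch under the assumption that the script is run from inside `Fastaudio`, so that `data` is a sibling directory. The helper name `missing_la_dirs` is hypothetical and is not part of the repository.

```python
from pathlib import Path

# Subdirectories of data/LA listed in the directory tree above.
EXPECTED_LA_DIRS = [
    "ASVspoof2019_LA_asv_protocols",
    "ASVspoof2019_LA_asv_scores",
    "ASVspoof2019_LA_cm_protocols",
    "ASVspoof2019_LA_train",
    "ASVspoof2019_LA_dev",
    "ASVspoof2021_LA_eval",
]


def missing_la_dirs(root):
    """Return the expected LA subdirectories that are absent under root/data/LA."""
    la_dir = Path(root) / "data" / "LA"
    return [name for name in EXPECTED_LA_DIRS if not (la_dir / name).is_dir()]


if __name__ == "__main__":
    # Assumption: the current directory is Fastaudio, so its parent holds `data`.
    missing = missing_la_dirs(Path.cwd().parent)
    if missing:
        print("Missing under data/LA:", ", ".join(missing))
    else:
        print("LA data layout looks complete.")
```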
python3.8 preprocess.py
- Change `args['data_type'] = ['labeled','unlabeled'][1]` in `preprocess.py` to `args['data_type'] = ['labeled','unlabeled'][0]`, then rerun preprocessing:
python3.8 preprocess.py
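The edit to `preprocess.py` only changes which element of a two-item list is selected. The sketch below illustrates that indexing in isolation. The names `DATA_TYPES` and `select_data_type` are hypothetical, and what `preprocess.py` does with each value is not shown here.

```python
# The same two options that appear in preprocess.py.
DATA_TYPES = ['labeled', 'unlabeled']


def select_data_type(index):
    """Return the data type chosen by a list index (0 or 1)."""
    return DATA_TYPES[index]


args = {}
args['data_type'] = select_data_type(1)  # index 1 selects 'unlabeled'
first_pass = args['data_type']

args['data_type'] = select_data_type(0)  # index 0 selects 'labeled'
second_pass = args['data_type']

print(first_pass, second_pass)
```

Running the preprocessing script once with each index therefore covers both data types.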
python3.8 train_spoofspeech.py yaml/SpoofSpeechClassifier.yaml --data_parallel_backend --data_parallel_count=2
- Modify `TRAIN` in `train_spoofspeech.py` to `False`.
python3.8 train_spoofspeech.py yaml/SpoofSpeechClassifier.yaml --data_parallel_backend --data_parallel_count=2
python3.8 eval.py
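$$\text{min t-DCF} = \min_{s} \left\{ \beta \, P_{\text{miss}}^{\text{cm}}(s) + P_{\text{fa}}^{\text{cm}}(s) \right\}$$
Here $s$ is the countermeasure (CM) decision threshold, $P_{\text{miss}}^{\text{cm}}(s)$ and $P_{\text{fa}}^{\text{cm}}(s)$ are the CM miss and false-alarm rates at that threshold, and $\beta$ is a weight determined by the cost parameters and the ASV system's error rates.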
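To make the metric concrete, the sketch below sweeps a threshold over a set of countermeasure scores and returns the smallest value of the weighted sum of the miss rate (bona fide trials rejected) and the false-alarm rate (spoofed trials accepted). It is an illustration only and not the official ASVspoof scoring script: the function name `min_tdcf` is hypothetical, the weight `beta` is taken as a given input rather than derived from ASV error rates and costs, and higher scores are assumed to mean "more bona fide".

```python
def min_tdcf(bonafide_scores, spoof_scores, beta):
    """Minimum over thresholds s of beta * P_miss_cm(s) + P_fa_cm(s).

    Assumed convention: a trial is accepted as bona fide when score >= s.
    """
    if not bonafide_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one value above the
    # maximum so that "reject everything" is also considered.
    all_scores = sorted(set(bonafide_scores) | set(spoof_scores))
    thresholds = all_scores + [all_scores[-1] + 1.0]

    best = float("inf")
    for s in thresholds:
        # Miss: a bona fide trial scored below the threshold.
        p_miss = sum(x < s for x in bonafide_scores) / len(bonafide_scores)
        # False alarm: a spoofed trial scored at or above the threshold.
        p_fa = sum(x >= s for x in spoof_scores) / len(spoof_scores)
        best = min(best, beta * p_miss + p_fa)
    return best


if __name__ == "__main__":
    # Toy scores, made up for illustration only.
    print(min_tdcf([2.0, 3.0], [0.0, 1.0], beta=2.0))
```

The loop recomputes both rates at every threshold, which is quadratic in the number of trials. That is acceptable for a small illustration, but a sorted cumulative count would be the usual choice for full evaluation sets.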
If you use this repository, please consider citing:
@inproceedings{Fu2021FastAudioAL,
title={FastAudio: A Learnable Audio Front-End for Spoof Speech Detection},
author={Quchen Fu and Zhongwei Teng and Jules White and M. Powell and Douglas C. Schmidt},
booktitle={2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year={2022},
organization={IEEE}
}
@inproceedings{Teng2021ComplementingHF,
title={Complementing Handcrafted Features with Raw Waveform Using a Light-weight Auxiliary Model},
author={Zhongwei Teng and Quchen Fu and Jules White and M. Powell and Douglas C. Schmidt},
year={2021}
}