A Dockerfile based on the NVIDIA 23.04 image is available at docker/Dockerfile, with all the dependencies and libraries needed to run the experiments. Otherwise, follow the manual installation steps below:
Create a virtual environment and activate it
sudo apt-get update
sudo apt-get install -y python3-pip python3-dev python3-tk
sudo pip3 install -U virtualenv
virtualenv --system-site-packages -p python3 ~/torch21
source ~/torch21/bin/activate
Install Python packages
pip3 install --upgrade pip
pip3 install -r requirements.txt
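After the install steps above, a quick check can confirm that the virtual environment is active and that the expected packages are visible. The following is a minimal sketch, not part of the repository: the helper names (in_virtualenv, package_available, environment_report) are hypothetical, and the default package name torch is an assumption based on the environment name ~/torch21, since the contents of requirements.txt are not listed here.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside a virtualenv/venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the system installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def package_available(name):
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def environment_report(packages=("torch",)):
    """Collect a small dict describing the current environment.

    The default package list is an assumption; adjust it to match
    whatever requirements.txt actually installs.
    """
    report = {"virtualenv": in_virtualenv()}
    for name in packages:
        report[name] = package_available(name)
    return report
```

Running print(environment_report()) inside the activated environment should show whether the interpreter is isolated and whether each listed package was found.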
The settings needed to replicate the results are stored in config/s3lspeech.py.
Download datasets
python3 main.py --run download
Pretrain the model
python3 main.py --run pretrain
Finetune the model for ASR
python3 main.py --run finetune
Results are written to the log file s3lspeech_results.pt_log. The pretrained and finetuned checkpoints are stored under data/.
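The download, pretrain, and finetune stages above can also be chained from a single script that stops at the first failure. This is a sketch under stated assumptions: it only wraps the three documented main.py --run invocations, it must be run from the repository root, and the names STAGES, stage_command, and run_pipeline are hypothetical helpers that do not exist in the project.

```python
import subprocess
import sys

# Stage names taken from the documented `--run` values, in execution order.
STAGES = ("download", "pretrain", "finetune")


def stage_command(stage, python=sys.executable):
    """Build the command line for one documented stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    return [python, "main.py", "--run", stage]


def run_pipeline(stages=STAGES, runner=subprocess.run):
    """Run each stage in order; abort as soon as one fails.

    `runner` defaults to subprocess.run and is injectable so the
    sequencing can be exercised without launching real jobs.
    """
    for stage in stages:
        cmd = stage_command(stage)
        print("Running:", " ".join(cmd))
        # check=True makes subprocess.run raise CalledProcessError
        # on a non-zero exit code, which halts the remaining stages.
        runner(cmd, check=True)
```

Calling run_pipeline() runs all three stages in order; passing, for example, stages=("pretrain", "finetune") skips the download when the datasets are already present.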
This project is licensed under the terms of the MIT license. See the LICENSE file for more information.