
DeepSpeech2 Implementation

Primary language: Python. License: MIT.

ASR project

An implementation of DeepSpeech2 for automatic speech recognition.

Installation guide

First, clone the repository and install the requirements.

git clone https://github.com/diddone/asr_project
cd asr_project
pip install -r requirements.txt
pip install gdown
python3 download_model_and_config.py

Download the model and the language model (LM)

mkdir lm
mkdir default_test_model
wget http://www.openslr.org/resources/11/3-gram.arpa.gz -P lm/
python3 download_model_and_config.py
python3 download_lm.py

Training summary:

  • Optimizer: Adam
  • Scheduler: OneCycleLR
  • Model: DeepSpeech2 (1 conv and 6 lstms with layernorms)
  • Dataset: full LibriSpeech
  • Evaluation: LM with beamsearch

All training details can be found in the report.
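
As a rough illustration of the optimizer and scheduler pairing listed above, here is a minimal sketch in PyTorch. The stand-in model, `max_lr`, `total_steps`, and the dummy loss are assumptions made for this example; they are not taken from the project's config, which is described in the report.

```python
import torch
from torch import nn

# Hypothetical stand-in model; the real DeepSpeech2 network lives in the repository.
model = nn.Linear(128, 28)

# Assumed values for illustration only, not the project's actual hyperparameters.
max_lr = 3e-4
total_steps = 100

optimizer = torch.optim.Adam(model.parameters(), lr=max_lr)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=max_lr, total_steps=total_steps
)

lrs = []
for step in range(total_steps):
    # Record the learning rate used for this batch.
    lrs.append(optimizer.param_groups[0]["lr"])

    optimizer.zero_grad()
    # Dummy loss on random data, standing in for the CTC loss on real batches.
    loss = model(torch.randn(4, 128)).pow(2).mean()
    loss.backward()
    optimizer.step()
    # OneCycleLR is stepped once per batch, not once per epoch.
    scheduler.step()
```

With the scheduler's default settings, the learning rate warms up from a small value to `max_lr` and then anneals to well below its starting value, which can be checked by plotting `lrs`.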

Results

The results below are for the "other" part of LibriSpeech.

  • Argmax WER 0.2651159555363254
  • Argmax CER 0.10502687285110361
  • LM WER 0.18848792641611095
  • LM CER 0.08801816818572956
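
For readers unfamiliar with the metrics, WER and CER are commonly defined as the edit distance between the reference and the predicted text, divided by the reference length in words or characters. The sketch below follows that standard definition in plain Python; the function names are chosen for this example, and the repository's own metric code may differ in details such as text normalization.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        # Convention assumed here for an empty reference.
        return 0.0 if not hyp_words else 1.0
    return edit_distance(ref_words, hyp_words) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        # Convention assumed here for an empty reference.
        return 0.0 if not hypothesis else 1.0
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, `wer("the cat sat", "the cat sit")` counts one substituted word out of three, and `cer` on the same pair counts one substituted character out of eleven.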

Wandb Report Link

Link to wandb report with plots and metrics