PTL2-DS2ish

This repository takes AssemblyAI's end-to-end ASR tutorial by Michael Nguyen as a starting point and converts its training code to work with the latest PyTorch Lightning release (2.0.2 as of May 2023). The conversion relies mainly on the official PyTorch Lightning preparation guide and uses the DNN Beamformer example by Zhaoheng Ni in torchaudio as a template.

Motivation

The goal is an easily understandable test harness and template for future ASR experiments. For that reason, the starting point is a relatively small and simple model: the (adapted) Deep Speech 2 implementation from the AssemblyAI tutorial.

Roadmap

May 21, 2023

Verified that the adapted PyTorch Lightning code behaves like the original pure PyTorch code. To shorten training time, max_epochs was reduced to 30 for runs 2-5 (run 1 used 100).

seed	epochs	WER (PyTorch Lightning code)	WER (AssemblyAI code)
1	100	0.2897	0.2922
2	30	0.3373	0.3421
3	30	0.3431	0.3416
4	30	0.3408	0.3445
5	30	0.3464	0.3452
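
As a quick sanity check on the comparison, the per-seed gap between the two implementations can be computed directly from the table. The snippet below is only a sketch: the WER values are copied from the table above, while the names `runs` and `wer_gaps` are hypothetical helpers introduced here and are not part of the repository.

```python
# WER values copied from the table above:
# (seed, epochs, WER with PyTorch Lightning code, WER with AssemblyAI code)
runs = [
    (1, 100, 0.2897, 0.2922),
    (2, 30, 0.3373, 0.3421),
    (3, 30, 0.3431, 0.3416),
    (4, 30, 0.3408, 0.3445),
    (5, 30, 0.3464, 0.3452),
]


def wer_gaps(rows):
    """Return (seed, Lightning WER minus AssemblyAI WER) for each run."""
    return [(seed, round(ptl - ref, 4)) for seed, _, ptl, ref in rows]


gaps = wer_gaps(runs)
max_abs_gap = max(abs(gap) for _, gap in gaps)
mean_gap = sum(gap for _, gap in gaps) / len(gaps)

for seed, gap in gaps:
    print(f"seed {seed}: {gap:+.4f}")
print(f"largest absolute gap: {max_abs_gap:.4f}")
print(f"mean signed gap: {mean_gap:+.4f}")
```

A negative gap means the PyTorch Lightning run scored a lower WER than the AssemblyAI run for that seed. With one run per seed and implementation, these gaps only indicate that the two versions land in the same range; they are not a statistical test.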
