RNNT-pytorch: A Python repository from amathews-amd

RNNT-pytorch

Implementation of RNN-Transducer

Installation

pip install -r requirements.txt
Install PyTorch
Install the RNN-T loss (warp-transducer)
Install torchaudio

## See HawkAaron's README for reference
git clone https://github.com/HawkAaron/warp-transducer
cd warp-transducer
mkdir build; cd build
cmake ..
make

cd pytorch_binding
python setup.py install
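
For intuition about what the RNN-T loss installed above computes, here is a minimal pure-Python sketch of the forward algorithm for a single utterance. It is a reference under stated assumptions (blank index 0, log-probabilities already produced by the joint network, no batching), not the warp-transducer implementation, and the function names are illustrative only.

```python
import math


def log_add(a, b):
    """Return log(exp(a) + exp(b)) in a numerically stable way."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def rnnt_neg_log_likelihood(log_probs, labels, blank=0):
    """Negative log-likelihood of `labels` under an RNN-T lattice.

    log_probs[t][u][k] is the log-probability of symbol k at frame t
    after u labels have been emitted (shape T x (U+1) x V).
    The blank index of 0 is an assumption of this sketch.
    """
    T, U = len(log_probs), len(labels)
    alpha = [[-math.inf] * (U + 1) for _ in range(T)]
    alpha[0][0] = 0.0
    for t in range(T):
        for u in range(U + 1):
            if t > 0:
                # Reach (t, u) by emitting blank at (t-1, u): advance one frame.
                step = alpha[t - 1][u] + log_probs[t - 1][u][blank]
                alpha[t][u] = log_add(alpha[t][u], step)
            if u > 0:
                # Reach (t, u) by emitting labels[u-1] at (t, u-1): advance one label.
                step = alpha[t][u - 1] + log_probs[t][u - 1][labels[u - 1]]
                alpha[t][u] = log_add(alpha[t][u], step)
    # Every alignment ends with one final blank from the last lattice node.
    return -(alpha[T - 1][U] + log_probs[T - 1][U][blank])


# Toy check: 2 frames, 1 label, 2 symbols, uniform probabilities.
# There are 2 alignments, each with probability 0.5 ** 3, so the loss is -log(0.25).
uniform = [[[math.log(0.5)] * 2 for _ in range(2)] for _ in range(2)]
print(round(rnnt_neg_log_likelihood(uniform, labels=[1]), 4))  # 1.3863
```

The loop is O(T x U) per utterance, which is why a compiled loss is used for actual training.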

Train Decoder (optional)

python train_decoder_LM.py --train-manifest ./data/LM/train_LM.txt

Train Network

python train.py --val-manifest {your val manifest csv path} --train-manifest {your train manifest csv path}
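
The manifest layout is not documented in this README. As a rough sketch only, the helper below writes a CSV with one `audio_path,transcript_path` row per utterance, which is an assumption borrowed from other speech repositories; check the data loader in this repository before relying on it. The function name and the file paths are hypothetical.

```python
import csv


def write_manifest(pairs, manifest_path):
    """Write one `audio_path,transcript_path` row per utterance.

    The two-column layout is an assumption, not something this README confirms.
    """
    with open(manifest_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for audio_path, transcript_path in pairs:
            writer.writerow([str(audio_path), str(transcript_path)])


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    train_pairs = [
        ("data/an4/train/wav/utt_0001.wav", "data/an4/train/txt/utt_0001.txt"),
        ("data/an4/train/wav/utt_0002.wav", "data/an4/train/txt/utt_0002.txt"),
    ]
    write_manifest(train_pairs, "train_manifest.csv")
```

The resulting file path is what would be passed to `--train-manifest`; a second file built the same way would go to `--val-manifest`.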

Results

Data	Parameter Setting	WER	CER
an4	3 encoder, 2 decoder, hidden size 250, dropout 0.2	25.06	19.2
an4	+ augmentation + batch normalization	18.11	13.72
an4	+ SpecAugment	12.14	10.4
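
For readers unfamiliar with the WER and CER columns, here is a minimal sketch of how such scores are commonly computed from edit distance. It assumes the table reports percentages and that CER ignores spaces; the scoring code in this repository may differ, and the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent; the reference must contain at least one word."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, with spaces removed (an assumption)."""
    ref_chars = reference.replace(" ", "")
    if not ref_chars:
        raise ValueError("reference must not be empty")
    hyp_chars = hypothesis.replace(" ", "")
    return 100.0 * edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# One substituted word out of three, one substituted character out of nine.
print(round(wer("the cat sat", "the cat sit"), 2))  # 33.33
print(round(cer("the cat sat", "the cat sit"), 2))  # 11.11
```

A single wrong character counts as a whole wrong word for WER, which is why WER is higher than CER in every row of the table.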

Things To Do

Support more types of input features: spectrogram, filter bank, word piece.
Refine the network architecture.
Support pre-training the LM and using it afterwards.

References

Exploring RNN-Transducer for Chinese Speech Recognition
speech (RNNT loss) by awni
E2E-ASR by HawkAaron
Exploring Architectures, Data and Units for Streaming End-to-End Speech Recognition with RNN-Transducer
A Comparison of Sequence-to-Sequence Models for Speech Recognition
