
DeepSpeech2 Online (Real-time) Decoder

This project works as an extension to this DeepSpeech2 implementation.

Getting Started

Prerequisites

Since this project is an extension of DeepSpeech2, you need to follow the installation instructions there first.

Installing

Simply copy the contents of this folder into the deepspeech.pytorch folder.

How to Run It

I'll illustrate it on this pretrained acoustic model and this ARPA language model. You can find other models here.

You need to edit some files before you run the server application.

  1. Open the run_decoder_server.sh file and change the following variables:
    --lm-path: the path of the language model.
    --model-path: the path of the acoustic model.
    --port: the port that the server will be listening on.
python decoder_server.py --host 0.0.0.0 \
                         --port 8888 \
                         --lm-path /volume/3-gram.pruned.3e-7.arpa \
                         --decoder beam --alpha 1.97 --beta 4.36 \
                         --model-path /volume/librispeech_pretrained_v2.pth \
                         --beam-width 1024 \
                         --cuda
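For reference, the flags above map onto a standard argparse setup. The sketch below is illustrative only (the flag names are taken from the command above, but the defaults and help strings are assumptions, not the project's actual code):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Flags mirror run_decoder_server.sh; defaults are illustrative only.
    p = argparse.ArgumentParser(description="DeepSpeech2 online decoder server")
    p.add_argument("--host", default="0.0.0.0")
    p.add_argument("--port", type=int, default=8888)
    p.add_argument("--lm-path", required=True, help="path to the ARPA language model")
    p.add_argument("--model-path", required=True, help="path to the acoustic model checkpoint")
    p.add_argument("--decoder", choices=["greedy", "beam"], default="beam")
    p.add_argument("--alpha", type=float, default=1.97)  # language model weight
    p.add_argument("--beta", type=float, default=4.36)   # word insertion bonus
    p.add_argument("--beam-width", type=int, default=1024)
    p.add_argument("--cuda", action="store_true")
    return p

# Parse the same arguments as the shell command above.
args = build_parser().parse_args([
    "--lm-path", "/volume/3-gram.pruned.3e-7.arpa",
    "--model-path", "/volume/librispeech_pretrained_v2.pth",
    "--decoder", "beam", "--beam-width", "1024", "--cuda",
])
```

Note that alpha and beta are the usual beam-search decoding weights for the language model score and word insertion bonus; they were tuned for this particular acoustic/language model pair, so retune them if you swap models.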
  2. Open js/app.js and change the following variables:
    X_seconds: record and send audio every X_seconds seconds.
var X_seconds = 3;

ws_ip: the IP address of the machine that runs the run_decoder_server.sh script.
ws_port: the port that you used in the run_decoder_server.sh script.

var ws_ip = '0.0.0.0'
var ws_port = '8888'
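X_seconds controls the latency/accuracy trade-off: shorter chunks give faster feedback but less acoustic context per decode. Assuming raw 16 kHz, 16-bit mono PCM audio (the LibriSpeech format the pretrained model was trained on — an assumption about the wire format, not something specified above), the per-chunk payload can be estimated as:

```python
def chunk_bytes(seconds: float, sample_rate: int = 16_000,
                bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Bytes of raw PCM audio sent per chunk (assumes 16 kHz, 16-bit, mono)."""
    return int(seconds * sample_rate * bytes_per_sample * channels)

# With the default X_seconds = 3 from js/app.js:
print(chunk_bytes(3))  # 96000 bytes (~94 KB) per WebSocket message
```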
  3. Copy data/extended_data_loader.py from this project to the data folder in the deepspeech.pytorch folder.

Finally, run the following in different terminals:

> python website_server.py
> ./run_decoder_server.sh

Authors

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Many thanks to those who made this project possible! It builds on the functionality of the open-source projects mentioned below.