icefall


Installation

Please refer to https://icefall.readthedocs.io/en/latest/installation/index.html for installation instructions.

Recipes

Please refer to https://icefall.readthedocs.io/en/latest/recipes/index.html for more information.

We provide four recipes at present:

yesno

This is the simplest ASR recipe in icefall and can be run on CPU. Training takes less than 30 seconds and gives you the following WER:

[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]
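The figures in the log line above fit together as WER = (insertions + deletions + substitutions) / number of reference words. The following minimal sketch reproduces that arithmetic; the function name `error_rate` and its parameters are illustrative assumptions and are not part of icefall's API.

```python
def error_rate(ins: int, dels: int, subs: int, ref_len: int) -> float:
    """Return the error rate in percent: (ins + del + sub) / ref_len * 100.

    Hypothetical helper for illustration; not icefall's scoring code.
    """
    if ref_len <= 0:
        raise ValueError("ref_len must be positive")
    return 100.0 * (ins + dels + subs) / ref_len


# Counts taken from the yesno log line: 1 error out of 240 reference words.
wer = error_rate(ins=0, dels=1, subs=0, ref_len=240)
print(f"%WER {wer:.2f}%")  # prints "%WER 0.42%"
```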

We provide a Colab notebook for this recipe: Open In Colab

LibriSpeech

We provide 4 models for this recipe:

Conformer CTC Model

The best WER we currently have is:

           test-clean    test-other
WER (%)    2.42          5.73

We provide a Colab notebook to run a pre-trained conformer CTC model: Open In Colab

TDNN LSTM CTC Model

The WER for this model is:

           test-clean    test-other
WER (%)    6.59          17.69

We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model: Open In Colab

Transducer: Conformer encoder + LSTM decoder

This model uses a Conformer as the encoder and an LSTM as the decoder.

The best WER with greedy search is:

           test-clean    test-other
WER (%)    3.07          7.51

We provide a Colab notebook to run a pre-trained RNN-T conformer model: Open In Colab

Transducer: Conformer encoder + Embedding decoder

This model uses a Conformer as the encoder. The decoder consists of one embedding layer and one convolutional layer.

The best WER using modified beam search with beam size 4 is:

           test-clean    test-other
WER (%)    2.56          6.27

Note: No auxiliary losses are used in the training and no LMs are used in the decoding.

We provide a Colab notebook to run a pre-trained transducer conformer + stateless decoder model: Open In Colab

Aishell

We provide three models for this recipe: conformer CTC model, transducer stateless model, and TDNN LSTM CTC model.

Conformer CTC Model

The best CER we currently have is:

test
CER 4.26
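For reference, CER is the same kind of measure as WER, computed over characters instead of words: the edit distance between the reference and the hypothesis, divided by the number of reference characters. The sketch below is a plain-Python illustration of that definition, assuming a standard Levenshtein distance. It is not icefall's scoring code, and the function names and sample strings are made up for the example.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate in percent (hypothetical helper)."""
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Made-up example: one substituted character in a six-character reference.
print(f"CER {cer('今天天气很好', '今天天汽很好'):.2f}")
```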

We provide a Colab notebook to run a pre-trained conformer CTC model: Open In Colab

Transducer Stateless Model

The best CER we currently have is:

test
CER 4.68

We provide a Colab notebook to run a pre-trained TransducerStateless model: Open In Colab

TDNN LSTM CTC Model

The CER for this model is:

test
CER 10.16

We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model: Open In Colab

TIMIT

We provide two models for this recipe: TDNN LSTM CTC model and TDNN LiGRU CTC model.

TDNN LSTM CTC Model

The best PER we currently have is:

TEST
PER 19.71%

We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model: Open In Colab

TDNN LiGRU CTC Model

The PER for this model is:

TEST
PER 17.66%

We provide a Colab notebook to run a pre-trained TDNN LiGRU CTC model: Open In Colab

Deployment with C++

Once you have trained a model in icefall, you may want to deploy it with C++, without Python dependencies.

Please refer to the documentation https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html#deployment-with-c for how to do this.

We also provide a Colab notebook that shows how to run a TorchScript model in k2 with C++: Open In Colab