Please refer to https://icefall.readthedocs.io/en/latest/installation/index.html for installation.
Please refer to https://icefall.readthedocs.io/en/latest/recipes/index.html for more information.
We provide four recipes at present:
This is the simplest ASR recipe in icefall
and can be run on CPU.
Training takes less than 30 seconds and gives you the following WER:
[test_set] %WER 0.42% [1 / 240, 0 ins, 1 del, 0 sub ]
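The WER in the log line above is the number of insertions, deletions and substitutions divided by the number of reference words: (0 + 1 + 0) / 240 ≈ 0.42%. Below is a minimal sketch of that arithmetic. The function names `edit_counts` and `error_rate` and the toy transcripts are illustrative assumptions, not icefall's own scoring code.

```python
def edit_counts(ref, hyp):
    """Return (insertions, deletions, substitutions) of a minimum-edit alignment."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # dp[i][j] = (total_errors, ins, dels, subs) for ref[:i] versus hyp[:j]
    dp = [[None] * cols for _ in range(rows)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        dp[i][0] = (i, 0, i, 0)  # empty hypothesis: every reference token is deleted
    for j in range(1, cols):
        dp[0][j] = (j, j, 0, 0)  # empty reference: every hypothesis token is inserted
    for i in range(1, rows):
        for j in range(1, cols):
            total, ins, dels, subs = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (total, ins, dels, subs)  # match, no new error
            else:
                diagonal = (total + 1, ins, dels, subs + 1)  # substitution
            total, ins, dels, subs = dp[i][j - 1]
            insertion = (total + 1, ins + 1, dels, subs)
            total, ins, dels, subs = dp[i - 1][j]
            deletion = (total + 1, ins, dels + 1, subs)
            # Tuples compare on total_errors first, so this keeps a minimum-error path.
            dp[i][j] = min(diagonal, insertion, deletion)
    _, ins, dels, subs = dp[-1][-1]
    return ins, dels, subs


def error_rate(ins, dels, subs, num_ref_tokens):
    """Error rate in percent: all edit operations over the reference length."""
    return 100.0 * (ins + dels + subs) / num_ref_tokens


# Toy transcripts (hypothetical): the hypothesis drops one reference word.
reference = "YES NO NO YES".split()
hypothesis = "YES NO YES".split()
print(edit_counts(reference, hypothesis))

# The counts reported in the log line: 1 error over 240 reference words.
print(f"{error_rate(0, 1, 0, 240):.2f}%")
```

Splitting transcripts into characters or phones instead of words gives a CER or PER from the same two functions.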
We provide a Colab notebook for this recipe.
We provide 4 models for this recipe:
- conformer CTC model
- TDNN LSTM CTC model
- Transducer: Conformer encoder + LSTM decoder
- Transducer: Conformer encoder + Embedding decoder
The best WER we currently have is:
| | test-clean | test-other |
|---|---|---|
| WER | 2.42 | 5.73 |
We provide a Colab notebook to run a pre-trained conformer CTC model:
The WER for this model is:
| | test-clean | test-other |
|---|---|---|
| WER | 6.59 | 17.69 |
We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model:
This model uses a Conformer as the encoder and an LSTM as the decoder.
The best WER with greedy search is:
| | test-clean | test-other |
|---|---|---|
| WER | 3.07 | 7.51 |
We provide a Colab notebook to run a pre-trained RNN-T conformer model:
This model uses a Conformer as the encoder. The decoder consists of 1 embedding layer and 1 convolutional layer.
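To illustrate why a decoder made of one embedding layer and one convolutional layer needs no recurrent state, here is a dependency-free sketch. Everything in it is an assumption made for illustration: the class name `StatelessDecoder`, the context size of 2, the per-channel (depthwise) kernel, the random weights, and the use of token 0 as blank padding. It is not the recipe's actual code.

```python
import random


class StatelessDecoder:
    """Toy prediction network: an embedding lookup followed by a 1-D convolution
    over the last `context_size` tokens. No hidden state is carried between calls."""

    def __init__(self, vocab_size, embed_dim, context_size=2, blank_id=0, seed=0):
        rng = random.Random(seed)
        self.embed_dim = embed_dim
        self.context_size = context_size
        self.blank_id = blank_id
        # Embedding layer: one vector of size embed_dim per token id.
        self.embedding = [
            [rng.uniform(-1.0, 1.0) for _ in range(embed_dim)]
            for _ in range(vocab_size)
        ]
        # Convolution kernel (assumed depthwise): one weight per channel and
        # per position inside the context window.
        self.kernel = [
            [rng.uniform(-1.0, 1.0) for _ in range(context_size)]
            for _ in range(embed_dim)
        ]

    def forward(self, tokens):
        """Return the decoder output for the next step given the emitted tokens."""
        # Left-pad with blanks so the first steps also see a full window,
        # then keep only the most recent context_size tokens.
        padded = [self.blank_id] * self.context_size + list(tokens)
        context = padded[-self.context_size:]
        embedded = [self.embedding[t] for t in context]
        # Convolve over the window: a weighted sum across positions per channel.
        return [
            sum(self.kernel[d][k] * embedded[k][d] for k in range(self.context_size))
            for d in range(self.embed_dim)
        ]


decoder = StatelessDecoder(vocab_size=10, embed_dim=4)
# Only the last two tokens matter, so these two histories give the same output.
print(decoder.forward([5, 6, 3, 4]) == decoder.forward([3, 4]))
```

Because the output depends only on a fixed window of previous tokens, the search procedure does not need to track a recurrent decoder state for each hypothesis.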
The best WER using beam search with beam size 4 is:
| | test-clean | test-other |
|---|---|---|
| WER | 2.83 | 7.19 |
Note: No auxiliary losses are used in the training and no LMs are used in the decoding.
We provide a Colab notebook to run a pre-trained transducer conformer + stateless decoder model:
We provide two models for this recipe: conformer CTC model and TDNN LSTM CTC model.
The best CER we currently have is:
| | test |
|---|---|
| CER | 4.26 |
We provide a Colab notebook to run a pre-trained conformer CTC model:
The best CER we currently have is:
| | test |
|---|---|
| CER | 5.7 |
We provide a Colab notebook to run a pre-trained TransducerStateless model:
The CER for this model is:
| | test |
|---|---|
| CER | 10.16 |
We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model:
We provide two models for this recipe: TDNN LSTM CTC model and TDNN LiGRU CTC model.
The best PER we currently have is:
| | TEST |
|---|---|
| PER | 19.71% |
We provide a Colab notebook to run a pre-trained TDNN LSTM CTC model:
The PER for this model is:
| | TEST |
|---|---|
| PER | 17.66% |
We provide a Colab notebook to run a pre-trained TDNN LiGRU CTC model:
Once you have trained a model in icefall, you may want to deploy it with C++, without Python dependencies.
Please refer to https://icefall.readthedocs.io/en/latest/recipes/librispeech/conformer_ctc.html#deployment-with-c for instructions.
We also provide a Colab notebook showing how to run a torch scripted model in k2 with C++. Please see: