The self-training-for-asr-hy notebook covers the complete first iteration of Noisy Student training for the Armenian language. Training is done using HuggingFace🤗, and LM-boosted decoding is based on pyctcdecode.
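To make the decoding step concrete, here is a minimal greedy CTC decoding sketch with a toy vocabulary. pyctcdecode replaces this greedy collapse with an LM-scored beam search, but the collapse-repeats-then-drop-blanks logic below is the core of plain CTC decoding; the vocabulary and logits are hypothetical.

```python
import numpy as np

VOCAB = ["<pad>", "a", "b", "c", " "]  # hypothetical labels; index 0 is the CTC blank

def greedy_ctc_decode(logits: np.ndarray) -> str:
    """Collapse repeated frame predictions, then drop blanks (greedy CTC, no LM)."""
    ids = logits.argmax(axis=-1)
    collapsed = [i for i, prev in zip(ids, [None, *ids[:-1]]) if i != prev]
    return "".join(VOCAB[i] for i in collapsed if i != 0)

# Frame-level scores spelling "ab": frames predict a, a, <pad>, b
logits = np.array([
    [0.1, 0.8, 0.05, 0.0, 0.05],
    [0.1, 0.8, 0.05, 0.0, 0.05],
    [0.9, 0.0, 0.05, 0.0, 0.05],
    [0.1, 0.0, 0.80, 0.05, 0.05],
])
print(greedy_ctc_decode(logits))  # → ab
```

A language model helps most exactly where greedy decoding fails: when the acoustically most likely character sequence is not a plausible word.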
These charts, generated by the above notebook, illustrate the effect of self-training. The first iteration yields relative improvements of 18.2% in WER and 28.4% in loss.
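For reference, WER is the word-level edit distance divided by the reference length, and the 18.2% figure is relative: (wer_before - wer_after) / wer_before. A minimal sketch of the standard WER computation (the notebook's exact implementation may differ):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit-distance table
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[-1][-1] / len(ref)

# 1 substitution out of 3 reference words
print(wer("the cat sat", "the cat sad"))
```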
The self-training-for-asr-pseudo-labeling notebook covers pseudo-labeled dataset generation.
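A hypothetical sketch of the pseudo-labeling idea, not the notebook's actual code: transcribe unlabeled audio with the current (teacher) model, then keep only utterances whose mean prediction confidence clears a threshold. The threshold and data layout are assumptions for illustration.

```python
def filter_pseudo_labels(predictions, threshold=0.9):
    """predictions: list of (audio_id, transcript, frame_confidences).
    Returns (audio_id, transcript) pairs confident enough to train on."""
    kept = []
    for audio_id, transcript, confs in predictions:
        mean_conf = sum(confs) / len(confs)
        if mean_conf >= threshold:
            kept.append((audio_id, transcript))
    return kept

preds = [
    ("utt1", "barev", [0.95, 0.97, 0.99]),   # high confidence, kept
    ("utt2", "???", [0.40, 0.55, 0.60]),     # low confidence, discarded
]
print(filter_pseudo_labels(preds))
```

The retained pairs are then mixed with the original labeled data to train the next (student) iteration.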
Noisy Student training can be repeated multiple times, with potential for further performance improvement. A larger model based on wav2vec2-xls-r-1b,
trained for 4 iterations using this approach, is available here. It achieves a WER of 10.81 with LM-boosted decoding (the best open-source Armenian ASR model to my knowledge).