This is a fork of the original fairseq repository (version 0.12.2) with added classes for training *mHuBERT-147: A Compact Multilingual HuBERT Model*.
Find details at: https://github.com/utter-project/fairseq/tree/main/examples/mHuBERT-147
- Pre-trained models with manifest files: https://huggingface.co/collections/utter-project/mhubert-147-models-665f1c1dea9a5601a1bfc905 (see the loading sketch after this list)
- Pre-processing and clustering scripts: https://github.com/utter-project/mHuBERT-147-scripts
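As a rough, unofficial usage sketch: a checkpoint downloaded from the collection above can be loaded through fairseq's standard checkpoint utilities, following the pattern of fairseq's own HuBERT feature-dumping example. The checkpoint filename and the chosen output layer below are assumptions, not values documented in this repository.

```python
# Unofficial sketch: load a downloaded mHuBERT-147 checkpoint and extract
# frame-level features. "mhubert-147.pt" is a hypothetical local filename;
# download the actual checkpoint from the HuggingFace collection above.
import torch
from fairseq import checkpoint_utils

models, cfg, task = checkpoint_utils.load_model_ensemble_and_task(
    ["mhubert-147.pt"]
)
model = models[0].eval()

# One second of 16 kHz dummy audio; replace with a real waveform.
wav = torch.zeros(1, 16000)

with torch.no_grad():
    # extract_features returns (features, padding_mask); output_layer=9
    # mirrors fairseq's HuBERT k-means example and is an assumption here.
    features, _ = model.extract_features(
        source=wav, padding_mask=None, mask=False, output_layer=9
    )
print(features.shape)  # (batch, frames, hidden_dim)
```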
If you use this work, please cite:

```bibtex
@inproceedings{boito2024mhubert,
  author={Marcely Zanon Boito and Vivek Iyer and Nikolaos Lagos and Laurent Besacier and Ioan Calapodescu},
  title={{mHuBERT-147: A Compact Multilingual HuBERT Model}},
  year={2024},
  booktitle={Interspeech 2024},
}
```
This is an output of the European Project UTTER (Unified Transcription and Translation for Extended Reality), funded by the European Union's Horizon Europe Research and Innovation programme under grant agreement number 101070631.
For more information, please visit https://he-utter.eu/