Issues regarding discrete WavLM and discrete HuBERT
anupsingh15 opened this issue · 4 comments
Describe the bug
Hi,
I am trying to extract tokens using the speechbrain.lobes.models.huggingface_transformers.discrete_wavlm and speechbrain.lobes.models.huggingface_transformers.discrete_hubert modules, but neither works because the model checkpoints are missing from the SpeechBrain repo on HF. Could you please let me know of any workaround to get discrete tokens using WavLM/HuBERT?
Expected behaviour
Successful loading of the pre-trained models
To Reproduce
No response
Environment Details
No response
Relevant Log Output
No response
Additional Context
No response
Hello @anupsingh15, You're right, the K-means HF repository is currently inaccessible due to an ongoing refactoring of the interface. A workaround could be to train your own K-means model (see SpeechBrain's LibriSpeech quantization recipe). Alternatively, I have just uploaded a list of pre-trained K-means models to my HF account (repository) that you can use until the new interface is merged.
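For anyone who wants to try the "train your own K-means" workaround without a GPU, here is a minimal NumPy-only sketch of the idea: cluster SSL frame features and map each frame to the ID of its nearest centroid. This is an illustration, not SpeechBrain's actual quantization recipe — the `train_kmeans`/`assign` helpers, the random stand-in features, and all sizes are hypothetical; in practice the features would be hidden-layer outputs from WavLM/HuBERT.

```python
import numpy as np

def assign(feats, centroids):
    # Squared Euclidean distances via ||x||^2 - 2 x.c + ||c||^2,
    # computed with a single matmul to keep memory bounded.
    d = (feats ** 2).sum(1, keepdims=True) - 2 * feats @ centroids.T \
        + (centroids ** 2).sum(1)
    return d.argmin(1)  # discrete token ID per frame

def train_kmeans(feats, n_clusters=32, n_iter=10, seed=0):
    # Plain Lloyd's K-means on CPU; runs on a feature subsample if
    # the full set does not fit in RAM.
    rng = np.random.default_rng(seed)
    # Initialise centroids from randomly chosen frames.
    centroids = feats[rng.choice(len(feats), n_clusters, replace=False)]
    for _ in range(n_iter):
        labels = assign(feats, centroids)
        for k in range(n_clusters):
            mask = labels == k
            if mask.any():
                centroids[k] = feats[mask].mean(0)
    return centroids

# Stand-in for SSL features: 2000 frames of dimension 16 (real WavLM/HuBERT
# features would be e.g. 768- or 1024-dimensional hidden states).
rng = np.random.default_rng(0)
feats = rng.normal(size=(2000, 16))
centroids = train_kmeans(feats, n_clusters=32)
tokens = assign(feats, centroids)  # one integer token per frame
```

Since this runs entirely on CPU, it also sidesteps the GPU-memory limits mentioned below; for large feature sets, scikit-learn's MiniBatchKMeans is a more scalable drop-in for the training step.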
Thanks @Chaanks. Do you plan to upload K-means models with 1024 cluster centroids for HuBERT and WavLM? I am training the K-means models for those myself, as you suggested; however, I run out of GPU memory due to limited resources.
Any news @Chaanks on that?
We have uploaded the models with 1000/2000 clusters for different layers in our own repo. We plan to move all the trained K-means models to the SpeechBrain repo once the refactoring is done. Here you can find the various trained K-means models 👍🏻 https://huggingface.co/poonehmousavi/SSL_Quantization/