Code for training a SpeechBrain embedding model on the VoxLingua107 dataset.
Since SpeechBrain recently added official support for WebDataset, this code will probably change, and I'll try to get it integrated into SpeechBrain.
The code is heavily inspired by https://github.com/nikvaessen/speechbrain/tree/sharded-voxceleb/my-recipes/SpeakerRec
This code was not really meant for sharing as is; I am only publishing it because several people asked for it. So I won't accept PRs, etc.