tud-zih-energy/voice-annotation-tool

Add speech model training functionality using Coqui

Add a way to train a speech model with Coqui using the samples loaded in the project.

Audio files need to be converted to .wav and saved in a cache folder that the user selects.
The converted files should be named so that their sample rate is obvious from the file name.
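
A minimal sketch of the conversion and naming step, assuming `ffmpeg` is installed and on the PATH. The function names, the `_<rate>hz` suffix, the mono downmix and the 16000 Hz default are illustrative assumptions, not something the tool already defines:

```python
import subprocess
from pathlib import Path
from typing import List


def cached_wav_path(source: Path, cache_dir: Path, sample_rate: int) -> Path:
    """Return the cache location for `source`, with the sample rate in the name."""
    # Hypothetical naming scheme: <original stem>_<rate>hz.wav
    return cache_dir / f"{source.stem}_{sample_rate}hz.wav"


def ffmpeg_command(source: Path, target: Path, sample_rate: int) -> List[str]:
    """Build an ffmpeg call that writes a mono WAV file at `sample_rate`."""
    return [
        "ffmpeg", "-y",            # overwrite the target if it exists
        "-i", str(source),         # input file in any format ffmpeg can read
        "-ar", str(sample_rate),   # output sample rate in Hz
        "-ac", "1",                # downmix to one channel (assumption)
        str(target),
    ]


def convert_to_wav(source: Path, cache_dir: Path, sample_rate: int = 16000) -> Path:
    """Convert `source` into the cache folder, reusing an existing conversion."""
    target = cached_wav_path(source, cache_dir, sample_rate)
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        subprocess.run(ffmpeg_command(source, target, sample_rate), check=True)
    return target
```

Putting the sample rate in the file name also lets the same cache folder hold conversions at several rates without collisions.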

The first step is to generate audio files and transcription files in a format Coqui can use.
The instructions for doing so are here: https://stt.readthedocs.io/en/latest/COMMON_VOICE_DATA.html
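
As a rough sketch of the transcription side: Coqui STT reads training data from CSV files with the columns `wav_filename`, `wav_filesize` and `transcript`. The helper below writes such a file from (path, text) pairs. The name `write_stt_csv` and the shape of the `samples` argument are assumptions for illustration; how the project exposes its samples may differ.

```python
import csv
from pathlib import Path
from typing import Iterable, Tuple


def write_stt_csv(samples: Iterable[Tuple[Path, str]], csv_path: Path) -> None:
    """Write (wav path, transcript) pairs as a Coqui STT training CSV."""
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for wav_path, transcript in samples:
            # wav_filesize is the size of the converted file in bytes.
            writer.writerow([str(wav_path), wav_path.stat().st_size, transcript])
```

The transcripts may still need normalizing (for example lowercasing and removing characters outside the model's alphabet) before training; that step is not shown here.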