Train text-to-speech (TTS) models using automated dataset generation techniques, such as splitting audio at detected silences and transcribing the resulting clips with Whisper.
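To illustrate the idea of splitting at silences, here is a minimal sketch of amplitude-threshold splitting. It is not this repository's implementation: the function name `split_on_silence`, the `threshold` and `min_silence_len` parameters, and their default values are all assumptions chosen for illustration, and real audio would normally be processed in frames rather than sample by sample.

```python
from typing import List, Sequence, Tuple


def split_on_silence(
    samples: Sequence[float],
    threshold: float = 0.01,   # hypothetical default: amplitudes below this count as silence
    min_silence_len: int = 3,  # hypothetical default: silent samples needed to end a segment
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of non-silent segments (end is exclusive)."""
    segments: List[Tuple[int, int]] = []
    start = None    # index where the current segment began, or None if in silence
    silent_run = 0  # consecutive silent samples seen inside the current segment

    for i, value in enumerate(samples):
        if abs(value) >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_len:
                # Close the segment at the first silent sample of this run.
                segments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0

    if start is not None:
        # The audio ended inside a segment; drop any short trailing silence.
        segments.append((start, len(samples) - silent_run))
    return segments
```

Silent runs shorter than `min_silence_len` stay inside a segment, so brief pauses between words do not cut a clip in two.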
The datasets generated are in the LJSpeech single-speaker dataset format: https://keithito.com/LJ-Speech-Dataset
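For reference, an LJSpeech-style dataset is a folder containing a `wavs/` directory of clips and a pipe-delimited `metadata.csv` with one line per clip: clip ID, transcription, and normalized transcription. The sketch below writes such a file. The helper name `write_ljspeech_metadata` and the sample rows are hypothetical, and whether this repository fills in the third column is an assumption.

```python
from pathlib import Path
from typing import Iterable, Tuple


def write_ljspeech_metadata(
    dataset_dir: Path,
    rows: Iterable[Tuple[str, str, str]],
) -> Path:
    """Write metadata.csv in LJSpeech layout: id|transcription|normalized transcription."""
    dataset_dir = Path(dataset_dir)
    # Each clip ID is expected to match a file wavs/<id>.wav.
    (dataset_dir / "wavs").mkdir(parents=True, exist_ok=True)

    metadata_path = dataset_dir / "metadata.csv"
    with metadata_path.open("w", encoding="utf-8", newline="\n") as f:
        for clip_id, text, normalized in rows:
            for field in (clip_id, text, normalized):
                # The format has no quoting, so the delimiter cannot appear in a field.
                if "|" in field or "\n" in field:
                    raise ValueError(f"field contains '|' or a newline: {field!r}")
            f.write(f"{clip_id}|{text}|{normalized}\n")
    return metadata_path
```

A call such as `write_ljspeech_metadata(Path("LJ001"), [("LJ001-0001", "Chapter 1.", "Chapter one.")])` would produce `LJ001/metadata.csv` next to an empty `LJ001/wavs/` folder.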
Mamba Package Manager: https://github.com/conda-forge/miniforge#mambaforge
Before running, read all of the README files in this repository. You can find them linked below:
python -m src.bin.generate_data --input_folder <INPUT FOLDER NAME>
Example: for speaker LJ001, run python -m src.bin.generate_data --input_folder LJ001
Information coming soon.