Welcome to the XTTS model fine-tuning repository! This project allows you to fine-tune XTTS (Cross-lingual Text-To-Speech) models specifically optimized for Apple's M1 Pro chipset using Python 3.10.
This project was tested on an M1 Pro Mac with 16GB RAM and focuses on fine-tuning XTTS models for TTS applications. The repository includes model compression techniques to optimize model performance. The main file you will run is `xtts_demo_with_model_compression.py`.

- Compatible with ebook2audiobookxtts
## Installation

Follow these steps to set up the project on your machine:

1. Clone the repo:

```bash
git clone https://github.com/DrewThomasson/finetuneXtts_apple_silicone.git
cd finetuneXtts_apple_silicone
```

2. Install dependencies. The installation requires pip's `--no-deps` option, since `requirements.txt` was generated from a `pip freeze`:

```bash
pip install --no-deps -r requirements.txt
```
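After installing, a quick sanity check can confirm you are running a native arm64 build of Python 3.10. This snippet is not part of the repo; the helper name `runtime_info` is made up for illustration:

```python
import platform
import sys

def runtime_info():
    """Collect interpreter details useful for debugging Apple Silicon installs."""
    return {
        "machine": platform.machine(),  # "arm64" on a native Apple Silicon build
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "platform": platform.platform(),
    }

info = runtime_info()
print(info)
# On an M1 Pro with a native interpreter you would expect
# machine == "arm64" and python == "3.10".
```

If `machine` reports `x86_64` on an M1/M2 Mac, the interpreter is running under Rosetta and packages may not use the native toolchain.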
### Docker usage on Apple Silicon (M1)

```bash
docker run -it -v ${PWD}/training:/tmp/xtts_ft/ athomasson2/fine_tune_xtts:M1
```

### Docker usage on x86 (requires at least 12 GB of VRAM)

```bash
docker run --gpus all -it -v ${PWD}/training:/tmp/xtts_ft/ athomasson2/fine_tune_xtts:v5
```

Both images are pulled from my Docker Hub.
## Usage

To fine-tune and run the XTTS model, use the provided demo script:

```bash
python3 xtts_demo_with_model_compression.py --port 5003 --out_path /your/output/path --num_epochs 6 --batch_size 2
```

- `--port`: Port for the Gradio demo (default: 5003)
- `--out_path`: Output directory for saved models (default: `/tmp/xtts_ft/`)
- `--num_epochs`: Number of training epochs (default: 10)
- `--batch_size`: Batch size for training (default: 4)
- `--grad_acumm`: Gradient accumulation steps (default: 1)
- `--max_audio_length`: Maximum audio length in seconds (default: 11)
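The flag set and defaults above can be sketched with `argparse`. This is a minimal illustration of the documented interface, not the script's actual parser:

```python
import argparse

def build_parser():
    # Mirrors the documented flags and their defaults (illustrative only).
    p = argparse.ArgumentParser(description="XTTS fine-tuning demo (sketch)")
    p.add_argument("--port", type=int, default=5003)
    p.add_argument("--out_path", type=str, default="/tmp/xtts_ft/")
    p.add_argument("--num_epochs", type=int, default=10)
    p.add_argument("--batch_size", type=int, default=4)
    p.add_argument("--grad_acumm", type=int, default=1)
    p.add_argument("--max_audio_length", type=int, default=11)
    return p

args = build_parser().parse_args([])  # no CLI flags -> all defaults apply
print(args.port, args.num_epochs)     # 5003 10
```

Flags you pass on the command line override the defaults, e.g. `--num_epochs 6 --batch_size 2` as in the example invocation above.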
## Project structure

- `xtts_demo_with_model_compression.py`: Main script to fine-tune, load, and run the XTTS model.
- `train_gpt.py`: Handles the GPT training aspects during fine-tuning.
- `format_audio_list.py`: Preprocesses the dataset for training.
- `export_model()`: Compresses and exports the fine-tuned model as a `.zip` file.
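The export step boils down to zipping the trained model directory. A minimal sketch of that idea, assuming a standard-library approach with `shutil.make_archive` (the helper below is illustrative, not the repo's actual `export_model()` implementation):

```python
import shutil
import tempfile
from pathlib import Path

def export_model_sketch(model_dir: str, out_base: str) -> str:
    """Zip a fine-tuned model directory and return the archive path."""
    # shutil.make_archive appends ".zip" to out_base itself.
    return shutil.make_archive(out_base, "zip", root_dir=model_dir)

# Demo with a throwaway directory standing in for a trained model.
work = Path(tempfile.mkdtemp())
model = work / "best_model"
model.mkdir()
(model / "model.pth").write_bytes(b"\x00" * 16)  # placeholder weights file
archive = export_model_sketch(str(model), str(work / "xtts_finetuned"))
print(archive)  # ends with "xtts_finetuned.zip"
```

Shipping a single `.zip` makes it easy to move the fine-tuned model to another machine or load it into tools such as ebook2audiobookxtts.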
## Features

- 🔥 Fine-tune XTTS models efficiently on Apple Silicon.
- 📂 Automatically compress and export the best model after fine-tuning.
- 🧠 Leverage model compression to optimize performance.
- 🌍 Supports various languages for training and inference.
## Notes

- This project was tested on an M1 Pro Mac with 16GB RAM.
- Ensure that all Python packages are compatible with Apple Silicon (M1/M2) architecture.
Feel free to contribute, suggest improvements, or raise any issues. Happy fine-tuning! 😎