🎤 XTTS Fine-Tuning on Apple Silicon M1 Pro 🖥️

Welcome to the XTTS model fine-tuning repository! This project allows you to fine-tune XTTS (Cross-lingual Text-To-Speech) models specifically optimized for Apple's M1 Pro chipset using Python 3.10.

💻 About

This project was tested on an M1 Pro Mac with 16GB RAM, focusing on the fine-tuning of XTTS models for TTS applications. The repository includes model compression techniques to optimize model performance. The main file you will run is xtts_demo_with_model_compression.py.

Compatable with ebook2audiobookxtts

🚀 Installation

Follow these steps to set up the project on your machine:

Clone the repo:

git clone [fineTuneXTTS-apple-Silicone](https://github.com/DrewThomasson/finetuneXtts_apple_silicone.git)
cd finetuneXtts_apple_silicone

Install dependencies: The installation requires the no-dependencies option since it's built from a pip freeze.
```
pip install --no-dependencies -r requirements.txt
```

🐳 Docker usage

docker run -it -v ${PWD}/training:/tmp/xtts_ft/ athomasson2/fine_tune_xtts:M1

Docker Usage on x86(You need 12 gb Vram at minimum)

docker run --gpus all -it -v ${PWD}/training:/tmp/xtts_ft/ athomasson2/fine_tune_xtts:v5

Taken from my dockerhub

🛠️ Usage

To fine-tune and run the XTTS model, use the provided demo script.

Run the Fine-Tuning Script

python3 xtts_demo_with_model_compression.py --port 5003 --out_path /your/output/path --num_epochs 6 --batch_size 2

Available Arguments:

--port: Specify the port to run Gradio demo (default: 5003)
--out_path: Output directory for saved models (default: /tmp/xtts_ft/)
--num_epochs: Number of epochs (default: 10)
--batch_size: Batch size for training (default: 4)
--grad_acumm: Gradient accumulation steps (default: 1)
--max_audio_length: Maximum audio length in seconds (default: 11)

📝 File Overview

xtts_demo_with_model_compression.py: Main script to fine-tune, load, and run the XTTS model.
train_gpt.py: Handles the GPT training aspects during fine-tuning.
format_audio_list.py: Preprocesses the dataset for training.
export_model(): Compresses and exports the fine-tuned model as a .zip file.

✨ Features

🔥 Fine-tune XTTS models efficiently on Apple Silicon.
📂 Automatically compress and export the best model after fine-tuning.
🧠 Leverage model compression to optimize performance.
🌍 Supports various languages for training and inference.

⚠️ Notes

This project was tested on an M1 Pro Mac with 16GB RAM.
Ensure that all Python packages are compatible with Apple Silicon (M1/M2) architecture.

Feel free to contribute, suggest improvements, or raise any issues. Happy fine-tuning! 😎