Linguistic: Real-Time Text to Speech and Voice Cloning AI

Welcome to the Linguistic project! This project leverages multiple TTS (Text to Speech) models, including Google Text to Speech (gTTS), Bark TTS, and Coqui TTS, to provide simple real-time text-to-speech generation and extensive voice cloning capabilities for free.

Features

Real-Time Text to Speech: Convert text to speech in real time using Google TTS.
Generative Text to Speech: Generate high-quality speech using Bark and Coqui TTS models.
Voice Cloning: Clone voices by providing a sample .wav file.
User-Friendly Interface: Intuitive web interface to easily interact with the TTS models.
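
Since voice cloning starts from an uploaded .wav sample, it can help to inspect that sample before passing it to a model. The following is a minimal sketch using only Python's standard `wave` module. The helper names `describe_wav` and `make_silence` are hypothetical and not part of this project, and the project itself may validate uploads differently or not at all.

```python
import io
import wave


def describe_wav(data: bytes) -> dict:
    """Return basic properties of a WAV file given its raw bytes.

    Raises wave.Error (or EOFError for truncated input) when the bytes
    are not a readable PCM WAV file.
    """
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / rate,
        }


def make_silence(seconds: float, rate: int = 16000) -> bytes:
    """Build a mono 16-bit silent WAV in memory, handy as a test fixture."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buffer.getvalue()


if __name__ == "__main__":
    print(describe_wav(make_silence(2.0)))
```

A check like this could reject samples that are empty or very short before any model is loaded. What counts as "too short" depends on the cloning model, so no threshold is suggested here.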

Working Demo

WorkingDemo_Linguistic.mp4

Technology Stack

Backend: Flask, PyTorch
Frontend: HTML, CSS, JavaScript
TTS Models: Google TTS (gTTS), Bark TTS, Coqui TTS

Installation

Clone the Repository:

```
git clone https://github.com/yourusername/linguistic.git
cd linguistic
```

Create a Virtual Environment:

```
python3 -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

Install Dependencies:
```
pip install -r requirements.txt
```
Download TTS Models: Download and place the required TTS models (Bark TTS and Coqui TTS) in the appropriate directory.
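
After installing, a quick way to confirm that the environment is complete is to check that each library can be found by Python. The sketch below assumes that `requirements.txt` installs the libraries named in the Technology Stack under their usual import names (`flask`, `torch`, `gtts`, `TTS` for Coqui TTS, `bark` for Bark). The actual requirements file may differ, so adjust the mapping as needed. The function `missing_modules` is a hypothetical helper, not part of this project.

```python
import importlib.util

# Display name -> import name. These import names are assumptions based on
# the libraries listed in this README, not read from requirements.txt.
REQUIRED_MODULES = {
    "Flask": "flask",
    "PyTorch": "torch",
    "gTTS": "gtts",
    "Coqui TTS": "TTS",
    "Bark": "bark",
}


def missing_modules(modules=REQUIRED_MODULES):
    """Return the display names of modules that cannot be located."""
    return sorted(
        label
        for label, import_name in modules.items()
        if importlib.util.find_spec(import_name) is None
    )


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All listed libraries were found.")
```

This only checks that the packages are importable. It does not verify that the Bark or Coqui model weights have been downloaded.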

Usage

Run the Flask Application:
```
flask run
```
Access the Web Interface: Open your web browser and navigate to http://127.0.0.1:5000/.
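
If the page does not load, a small script can tell whether the development server is answering at all. This is a sketch using only the standard library. `server_is_up` is a hypothetical helper, and the default URL is the address given above.

```python
import urllib.error
import urllib.request


def server_is_up(url: str = "http://127.0.0.1:5000/", timeout: float = 2.0) -> bool:
    """Return True if `url` answers with an HTTP status below 500."""
    # An empty ProxyHandler bypasses any system proxy, which should not be
    # involved when talking to a local development server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status < 500
    except urllib.error.HTTPError as error:
        # The server answered, just not with a success status (e.g. 404).
        return error.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: nothing is listening.
        return False


if __name__ == "__main__":
    print("up" if server_is_up() else "down")
```

Run it in a second terminal while `flask run` is active.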

Project Structure

app.py: Main application file containing the Flask routes and logic.
templates/: Directory containing HTML templates for the web pages.
- home.html: Home page with navigation options.
- index.html: Page for Generative TTS.
- gtts.html: Page for Google TTS.
static/: Directory for static files (CSS, JavaScript, etc.).

Flask Routes

Home Page (/):
- Description: Welcome page with navigation buttons to TTS and gTTS pages.
- Template: home.html
Generative TTS (/tts):
- Description: Page for generating speech using Bark and Coqui TTS models.
- Template: index.html
- Method: POST
- Form Inputs:
  - text: Text to be converted to speech.
  - file: Upload a .wav file for voice cloning.
  - language: Language code (default is en).
- Output: Generates and downloads the speech file.
Google TTS (/gtts):
- Description: Page for generating speech using Google TTS.
- Template: gtts.html
- Method: POST
- Form Inputs:
  - text: Text to be converted to speech.
- Output: Generates and plays the speech file in the browser.
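
The routes above could be wired together roughly as follows. This is a sketch under stated assumptions, not the contents of `app.py`: the `synthesize` callable stands in for the Bark/Coqui generation step and is hypothetical, `parse_tts_form` is a hypothetical validation helper, and accepting GET on `/tts` and `/gtts` to render the templates is an assumption (only POST is listed above). Flask and gTTS are imported lazily so the validation helper can be used and tested without them.

```python
import io

ALLOWED_EXTENSION = ".wav"
DEFAULT_LANGUAGE = "en"  # default language code listed for /tts


def parse_tts_form(form, filename):
    """Validate the /tts form fields.

    `form` is any mapping of field names to strings; `filename` is the
    uploaded file's name, or None/"" when no sample was uploaded.
    Returns (cleaned, error); exactly one of the two is None.
    """
    text = (form.get("text") or "").strip()
    if not text:
        return None, "text is required"
    if filename and not filename.lower().endswith(ALLOWED_EXTENSION):
        return None, "voice sample must be a .wav file"
    language = (form.get("language") or DEFAULT_LANGUAGE).strip().lower()
    return {"text": text, "language": language, "has_sample": bool(filename)}, None


def create_app(synthesize):
    """Build the app. `synthesize(text, language, sample_bytes)` is a
    hypothetical callable that returns WAV bytes."""
    from flask import Flask, render_template, request, send_file

    app = Flask(__name__)

    @app.route("/")
    def home():
        return render_template("home.html")

    @app.route("/tts", methods=["GET", "POST"])
    def tts():
        if request.method == "GET":
            return render_template("index.html")
        upload = request.files.get("file")
        filename = upload.filename if upload else None
        cleaned, error = parse_tts_form(request.form, filename)
        if error:
            return error, 400
        sample = upload.read() if cleaned["has_sample"] else None
        audio = synthesize(cleaned["text"], cleaned["language"], sample)
        # as_attachment=True makes the browser download the result.
        return send_file(io.BytesIO(audio), mimetype="audio/wav",
                         as_attachment=True, download_name="speech.wav")

    @app.route("/gtts", methods=["GET", "POST"])
    def google_tts():
        if request.method == "GET":
            return render_template("gtts.html")
        text = (request.form.get("text") or "").strip()
        if not text:
            return "text is required", 400
        from gtts import gTTS  # calls Google's service, so it needs network access

        buffer = io.BytesIO()
        gTTS(text=text, lang=DEFAULT_LANGUAGE).write_to_fp(buffer)
        buffer.seek(0)
        # Served inline as MP3 so the browser can play it directly.
        return send_file(buffer, mimetype="audio/mpeg")

    return app
```

Keeping validation in a plain function means bad input is rejected with a 400 response before any model runs, and the rule for each form field can be tested without starting a server.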

How to Contribute

Fork the Repository: Click on the "Fork" button at the top right of the repository page.
Clone the Forked Repository: Clone the forked repository to your local machine.
Create a New Branch: Create a new branch for your feature or bugfix.
```
git checkout -b feature-name
```
Make Changes: Make your changes in the new branch.

Commit and Push: Commit and push your changes to the new branch.

```
git add .
git commit -m "Description of changes"
git push origin feature-name
```

Create a Pull Request: Go to the original repository and create a pull request from your fork.

License

This project is licensed under the MIT License. See the LICENSE file for more information.

Contributors

Special acknowledgment goes to the following developers and contributors for their invaluable contributions to the web page development:

Acknowledgments

Special thanks to the developers and contributors of the following libraries and models:

🚀 About Me

I am an AI Specialist and Data Engineer at Navikenz and a growing Android developer (Kotlin). Both fields, machine learning and Android development, fascinate me. I have also worked on the Azure cloud computing platform to deploy machine learning models.

To learn more about me, search for "Bitan Paul" on Google.

thebitanpaul/Linguistic