VoiceVerse is a Streamlit application for text-to-speech, translation, and simulated conversations. It uses the ElevenLabs API for voice synthesis and deep-translator for translation, and provides the following features:
- Text-to-Speech: Enter text and listen to it spoken in different voices and languages.
- Translation: Translate text between supported languages.
- Translate & Speak: Translate text and generate audio in the translated language.
- Simulated Conversation: Create a back-and-forth conversation between two people with different text inputs, voices, and languages.
- Merged Conversation Audio: Simulate a conversation and merge both audio clips into a single file.
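The merged-audio feature can be pictured with a small pydub sketch. This is an illustration under stated assumptions, not the code in convo_merge.py: the function names, the 500 ms pause, and the MP3 output format are all hypothetical choices.

```python
def join_with_gap(clips, gap):
    """Concatenate clips in order, placing `gap` between neighbours.

    Works for any objects that support `+`; pydub's AudioSegment does.
    """
    if not clips:
        raise ValueError("need at least one clip")
    merged = clips[0]
    for clip in clips[1:]:
        merged = merged + gap + clip
    return merged


def merge_conversation(path_a, path_b, out_path, gap_ms=500):
    """Merge two speaker clips into one file (hypothetical helper)."""
    # Imported lazily so join_with_gap can be used without pydub installed.
    from pydub import AudioSegment

    first = AudioSegment.from_file(path_a)
    second = AudioSegment.from_file(path_b)
    pause = AudioSegment.silent(duration=gap_ms)  # assumed pause length
    merged = join_with_gap([first, second], pause)
    merged.export(out_path, format="mp3")  # export requires ffmpeg
    return out_path
```

Keeping the concatenation in its own function makes the ordering logic easy to check without audio files or ffmpeg.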
The application is organized as follows:
app/
├── main.py                  # Main app file with sidebar navigation
├── pages/
│   ├── speech.py            # Page for basic text-to-speech
│   ├── translator.py        # Page for translation
│   ├── translator_plus.py   # Page for translation + TTS
│   ├── convo.py             # Page for basic conversation simulation
│   └── convo_merge.py       # Page for merged conversation audio
└── config.py                # File for API keys and configuration
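As a rough illustration of how main.py might connect the sidebar to these page modules, here is a minimal sketch. It assumes each module under pages/ exposes a `render()` function. That function name, the sidebar labels, and the `module_path` helper are hypothetical, and the actual main.py may be organized differently.

```python
import importlib

# Sidebar label -> module name under pages/ (module names from the tree above).
PAGES = {
    "Text-to-Speech": "speech",
    "Translation": "translator",
    "Translate & Speak": "translator_plus",
    "Simulated Conversation": "convo",
    "Merged Conversation Audio": "convo_merge",
}


def module_path(label):
    """Return the dotted import path for a sidebar label."""
    return f"pages.{PAGES[label]}"


def main():
    # Imported lazily so the page table can be inspected without Streamlit.
    import streamlit as st

    st.title("VoiceVerse")
    choice = st.sidebar.radio("Navigate", list(PAGES))
    page = importlib.import_module(module_path(choice))
    page.render()  # hypothetical entry point each page is assumed to define
```

In a real main.py, `main()` would be called at the bottom of the file so that `streamlit run app/main.py` executes it.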
- Clone the repository:
  git clone <repository-url>
  cd voiceverse
- Install dependencies:
  pip install -r requirements.txt
- Ensure you have ffmpeg installed on your system, as it is required by pydub:
  - macOS: brew install ffmpeg
  - Ubuntu: sudo apt-get install ffmpeg
  - Windows: Download ffmpeg from ffmpeg.org and add it to the system PATH.
- Set up API keys:
  - In config.py, add your ElevenLabs API key.
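A quick way to confirm the setup steps before launching is a small preflight check. The sketch below is not part of the project. It assumes the API key is supplied through an environment variable named `ELEVENLABS_API_KEY`, which is a hypothetical name; if the key is written directly into config.py, only the ffmpeg check applies.

```python
import os
import shutil


def preflight(env=None, which=shutil.which):
    """Return a list of setup problems; an empty list means none were found."""
    env = os.environ if env is None else env
    problems = []
    # pydub calls the ffmpeg binary, so it has to be discoverable on PATH.
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH (required by pydub)")
    # ELEVENLABS_API_KEY is an assumed variable name, not one the project defines.
    if not env.get("ELEVENLABS_API_KEY"):
        problems.append("ELEVENLABS_API_KEY is not set")
    return problems
```

Calling `preflight()` with no arguments checks the current environment; the `env` and `which` parameters exist only so the check can be exercised without changing the system.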
To start the app, run:
streamlit run app/main.py
Then, use the sidebar to navigate through the different features.
The app depends on:
- Streamlit
- ElevenLabs Python API
- deep-translator
- pydub
- ffmpeg (for audio processing)
This project is licensed under the MIT License.