/elevenlabs-python

The official Python API for ElevenLabs text-to-speech.

Primary LanguagePython

LOGO

Discord Twitter PyPI - Python Version Downloads

The official Python API for ElevenLabs text-to-speech software. Eleven brings the most compelling, rich and lifelike voices to creators and developers in just a few lines of code.

⚙️ Install

pip install elevenlabs

🗣️ Usage

Open in Spaces Open In Colab

We support two main models: the newest eleven_multilingual_v2, a single foundational model supporting 28 languages including English, Chinese, Spanish, Hindi, Portuguese, French, German, Japanese, Arabic, Korean, Indonesian, Italian, Dutch, Turkish, Polish, Swedish, Filipino, Malay, Romanian, Ukrainian, Greek, Czech, Danish, Finnish, Bulgarian, Croatian, Slovak, and Tamil; and eleven_monolingual_v1, a low-latency model specifically trained for English speech.

from elevenlabs import generate, play

audio = generate(
  text="Hello! 你好! Hola! नमस्ते! Bonjour! こんにちは! مرحبا! 안녕하세요! Ciao! Cześć! Привіт! வணக்கம்!",
  voice="Bella",
  model="eleven_multilingual_v2"
)

play(audio)
Play

Don't forget to unmute the player!

audio.3.webm

🗣️ Voices

List all your available voices with voices().

from elevenlabs import voices, generate

voices = voices()
audio = generate(text="Hello there!", voice=voices[0])
print(voices)
Show output
Voices(
    voices=[
        Voice(
            voice_id='21m00Tcm4TlvDq8ikWAM',
            name='Rachel',
            category='premade',
            settings=None,
        ),
        Voice(
            voice_id='AZnzlk1XvdvUeBnXmlld',
            name='Domi',
            category='premade',
            settings=None,
        ),
        ...
    ]
)

Build a voice object with custom settings to personalize the voice style, or call voice.fetch_settings() to get the default settings for the voice.

from elevenlabs import Voice, VoiceSettings, generate

audio = generate(
    text="Hello! My name is Bella.",
    voice=Voice(
        voice_id='EXAVITQu4vr4xnSDxMaL',
        settings=VoiceSettings(stability=0.71, similarity_boost=0.5, style=0.0, use_speaker_boost=True)
    )
)

play(audio)

Clone Voice

Clone your voice in an instant. Note that voice cloning requires an API key, see below.

from elevenlabs import clone, generate, play

voice = clone(
    name="Alex",
    description="An old American male voice with a slight hoarseness in his throat. Perfect for news", # Optional
    files=["./sample_0.mp3", "./sample_1.mp3", "./sample_2.mp3"],
)

audio = generate(text="Hi! I'm a cloned voice!", voice=voice)

play(audio)

🚿 Streaming

Stream audio in real-time, as it's being generated.

from elevenlabs import generate, stream

audio_stream = generate(
  text="This is a... streaming voice!!",
  stream=True
)

stream(audio_stream)

Input streaming

Stream text chunks into audio as it's being generated, with <1s latency. Note: if chunks don't end with space or punctuation (" ", ".", "?", "!"), the stream will wait for more text.

from elevenlabs import generate, stream

def text_stream():
    yield "Hi there, I'm Eleven "
    yield "I'm a text to speech API "

audio_stream = generate(
    text=text_stream(),
    voice="Nicole",
    model="eleven_monolingual_v1",
    stream=True
)

stream(audio_stream)

🔑 API Key

The basic API has a limited number of characters. To increase this limit, you can get a free API key from Elevenlabs (step-by-step guide) and set is as environment variable ELEVEN_API_KEY. Alternatively, you can provide the api_key string argument to the generate function, or set it globally in code with:

from elevenlabs import set_api_key
set_api_key("<YOUR_API_KEY>")

📖 API & Docs

Learn more about the Python API, or check out the HTTP API documentation.