text-to-audio

There are 49 repositories under the text-to-audio topic.

  • open-mmlab/Amphion

    Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

    Language: Python
  • declare-lab/tango

    A family of diffusion models for text-to-audio generation.

    Language: Python
  • gitmylo/audio-webui

    A web UI for various audio-related neural networks

    Language: Python
  • ictnlp/StreamSpeech

    StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

    Language: Python
  • hkchengrex/MMAudio

    [arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

    Language: Python
  • Text-to-Audio/Make-An-Audio

    PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model

    Language: Python
  • lucidrains/nuwa-pytorch

    Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch

    Language: Python
  • ivcylc/OpenMusic

    OpenMusic: SOTA Text-to-music (TTM) Generation

    Language: Python
  • declare-lab/TangoFlux

    TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching

    Language: Jupyter Notebook
  • YingqingHe/Awesome-LLMs-meet-Multimodal-Generation

    🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).

    Language: HTML
  • AMAAI-Lab/mustango

    Mustango: Toward Controllable Text-to-Music Generation

    Language: Python
  • haidog-yaqub/EzAudio

    High-quality Text-to-Audio Generation with Efficient Diffusion Transformer

    Language: Python
  • happylittlecat2333/Auffusion

    Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"

    Language: Jupyter Notebook
  • ilaria-manco/word2wave

    Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.

    Language: Python
  • bnsantoso/sub-to-audio

    Subtitle to audio: generates audio from any subtitle file using Coqui-ai TTS and synchronizes the audio timing with the subtitle timestamps.

    Language: Python
  • sony/soundctm

    PyTorch implementation of SoundCTM

    Language: Python
  • keonlee9420/WaveGrad2

    PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

    Language: Python
  • serp-ai/ai-text-to-audio-latent-diffusion

    text-to-audio-latent-diffusion

    Language: Python
  • RhythrosaLabs/soundstorm

    Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.

    Language: Python
  • PapayaResearch/ctag

    Creative Text-to-Audio Generation via Synthesizer Programming @ ICML'24

    Language: Python
  • camenduru/audioldm-colab

    AudioLDM text-to-audio Colab

    Language: Jupyter Notebook
  • kennethleungty/Text-to-Audio-with-Bark

    Exploring Bark, the Open-Source Text-to-Audio Generative Model

    Language: Jupyter Notebook
  • Djmcflush/RaveFussion

    A text-to-audio pipeline using Riffusion (a fine-tuned Stable Diffusion model) and RAVE, an audio-to-audio autoencoder.

    Language: Python
  • GabrieleRisso/aiyu

    Core shell-function building blocks for advanced AI pipelines

  • Consistency-TTA/consistency-tta.github.io

    Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation

    Language: HTML
  • XiaomingX/awesome-ai-tools-for-game-dev

    Awesome AI Tools for Game Development: A curated collection of the best AI tools, libraries, and resources to enhance game development workflows. From procedural content generation to NPC behavior, this repository gathers state-of-the-art AI solutions for game developers.

  • hkchengrex/av-benchmark

    Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs, LAION-CLAP, MS-CLAP, DeSync

    Language: Python
  • vishalnagda1/text-to-speech

    Python program to convert text to speech.

    Language: Python
  • ahsplore/TalkitOut-TTS-web-application-python

    TalkItOut is a Python and Flask-based web application that converts text to speech, lets you choose the language for audio output, provides a built-in dictionary for word meanings, and even extracts text from images, complete with audio generation.

    Language: HTML
  • inferless/bark

    Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying.

    Language: Python
  • Ate329/SentiMusic

    A text-to-audio application that turns words and sentiments into melodies.

    Language: Python
  • dimitreOliveira/GenAI-GeoGuesser

    Generative AI version of the GeoGuesser game.

    Language: Python
  • Yazdi9/Text-To-Audio-ChatGPT

    Text to audio (voice, music) with ChatGPT support

    Language: Python
  • brayanjeshua/chatgpt-to-speech

    ChatGPT Text-to-Speech Application

    Language: JavaScript
  • mohaimenulislamshawon/text-to-voice-speech-converter

    The program is based on Google's text-to-speech (voice converter) engine. It can convert text in the top 20 languages. I made this for educational and experimental purposes.

    Language: HTML