text-to-audio

There are 49 repositories under the text-to-audio topic.

open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
Language:Python8k 76 229614
declare-lab/tango
A family of diffusion models for text-to-audio generation.
Language:Python1.1k 28 5094
gitmylo/audio-webui
A web UI for different audio-related neural networks
Language:Python1.1k 24 197103
ictnlp/StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
Language:Python991 13 1676
hkchengrex/MMAudio
[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Language:Python920 11 2596
Text-to-Audio/Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
Language:Python763 71 14112
lucidrains/nuwa-pytorch
Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch
Language:Python546 23 957
ivcylc/OpenMusic
OpenMusic: SOTA Text-to-music (TTM) Generation
Language:Python521 10 1450
declare-lab/TangoFlux
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching
Language:Jupyter Notebook50245
YingqingHe/Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
Language:HTML405 15 523
AMAAI-Lab/mustango
Mustango: Toward Controllable Text-to-Music Generation
Language:Python346 16 1628
haidog-yaqub/EzAudio
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
Language:Python252 18 59
happylittlecat2333/Auffusion
Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"
Language:Jupyter Notebook167 9 1113
ilaria-manco/word2wave
Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.
Language:Python119 3 315
bnsantoso/sub-to-audio
Subtitle to audio, generate audio from any subtitle file using Coqui-ai TTS and synchronize the audio timing according to subtitle time.
Language:Python108 5 1913
sony/soundctm
PyTorch implementation of SoundCTM
Language:Python75 3 26
keonlee9420/WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
Language:Python67 6 517
serp-ai/ai-text-to-audio-latent-diffusion
text-to-audio-latent-diffusion
Language:Python37 6 18
RhythrosaLabs/soundstorm
Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.
Language:Python30 3 37
PapayaResearch/ctag
Creative Text-to-Audio Generation via Synthesizer Programming @ ICML'24
Language:Python21 3 02
camenduru/audioldm-colab
AudioLDM text to audio colab
Language:Jupyter Notebook19 3 13
kennethleungty/Text-to-Audio-with-Bark
Exploring Bark, the Open-Source Text-to-Audio Generative Model
Language:Jupyter Notebook15 3 04
Djmcflush/RaveFussion
A text-to-audio pipeline using Riffusion (a fine-tuned Stable Diffusion model) and RAVE, an audio-to-audio autoencoder.
Language:Python14 1 17
GabrieleRisso/aiyu
Core shell functions: building blocks for advanced AI pipelines
13 3 00
Consistency-TTA/consistency-tta.github.io
Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
Language:HTML7 1 00
XiaomingX/awesome-ai-tools-for-game-dev
Awesome AI Tools for Game Development: A curated collection of the best AI tools, libraries, and resources to enhance game development workflows. From procedural content generation to NPC behavior, this repository gathers state-of-the-art AI solutions for game developers.
7
hkchengrex/av-benchmark
Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs, LAION-CLAP, MS-CLAP, DeSync
Language:Python5
vishalnagda1/text-to-speech
Python program to convert text to speech.
Language:Python5 2 217
ahsplore/TalkitOut-TTS-web-application-python
TalkItOut is a Python and Flask-based web application that converts text to speech, lets you choose the language of the audio output, provides a built-in dictionary for word meanings, and extracts text from images for audio generation.
Language:HTML4 1 12
inferless/bark
Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying.
Language:Python4 2 011
Ate329/SentiMusic
A text-to-audio application that turns words and sentiments into melodies.
Language:Python3 2 00
dimitreOliveira/GenAI-GeoGuesser
Generative AI version of the GeoGuesser game.
Language:Python3 1 0
Yazdi9/Text-To-Audio-ChatGPT
Text to audio (voice, music) with ChatGPT support
Language:Python3 2 01
artinmohajeri/tkinter-text-to-voice
Language:Python2 1 0
brayanjeshua/chatgpt-to-speech
ChatGPT text-to-speech application
Language:JavaScript2 1 00
mohaimenulislamshawon/text-to-voice-speech-converter
This program is based on Google Text-to-Speech. It can convert text to speech in the top 20 languages. It was made for educational and experimental purposes.
Language:HTML2 1 01
