text-to-audio
There are 49 repositories under the text-to-audio topic.
open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
declare-lab/tango
A family of diffusion models for text-to-audio generation.
gitmylo/audio-webui
A web UI for various audio-related neural networks
ictnlp/StreamSpeech
StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
hkchengrex/MMAudio
[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Text-to-Audio/Make-An-Audio
PyTorch Implementation of Make-An-Audio (ICML'23) with a Text-to-Audio Generative Model
lucidrains/nuwa-pytorch
Implementation of NÜWA, a state-of-the-art attention network for text-to-video synthesis, in PyTorch
ivcylc/OpenMusic
OpenMusic: SOTA Text-to-music (TTM) Generation
declare-lab/TangoFlux
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching
YingqingHe/Awesome-LLMs-meet-Multimodal-Generation
🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D and audio).
AMAAI-Lab/mustango
Mustango: Toward Controllable Text-to-Music Generation
haidog-yaqub/EzAudio
High-quality Text-to-Audio Generation with Efficient Diffusion Transformer
happylittlecat2333/Auffusion
Official code and models for the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"
ilaria-manco/word2wave
Word2Wave: a framework for generating short audio samples from a text prompt using WaveGAN and COALA.
bnsantoso/sub-to-audio
Subtitle to audio, generate audio from any subtitle file using Coqui-ai TTS and synchronize the audio timing according to subtitle time.
sony/soundctm
PyTorch implementation of SoundCTM
keonlee9420/WaveGrad2
PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis
serp-ai/ai-text-to-audio-latent-diffusion
text-to-audio-latent-diffusion
RhythrosaLabs/soundstorm
Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.
PapayaResearch/ctag
Creative Text-to-Audio Generation via Synthesizer Programming @ ICML'24
camenduru/audioldm-colab
AudioLDM text-to-audio Colab
kennethleungty/Text-to-Audio-with-Bark
Exploring Bark, the Open-Source Text-to-Audio Generative Model
Djmcflush/RaveFussion
A text-to-audio pipeline using Riffusion (a fine-tuned Stable Diffusion model) and RAVE, an audio-to-audio autoencoder.
GabrieleRisso/aiyu
Core shell-function building blocks for advanced AI pipelines
Consistency-TTA/consistency-tta.github.io
Accelerating Diffusion-Based Text-to-Audio Generation with Consistency Distillation
XiaomingX/awesome-ai-tools-for-game-dev
Awesome AI Tools for Game Development: A curated collection of the best AI tools, libraries, and resources to enhance game development workflows. From procedural content generation to NPC behavior, this repository gathers state-of-the-art AI solutions for game developers.
hkchengrex/av-benchmark
Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs, LAION-CLAP, MS-CLAP, DeSync
vishalnagda1/text-to-speech
Python program to convert text to speech.
ahsplore/TalkitOut-TTS-web-application-python
TalkItOut is a Python and Flask-based web application that converts text to speech, lets you choose your preferred language for audio output, provides a built-in dictionary for word meanings, and extracts text from images, complete with audio generation.
inferless/bark
Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying.
Ate329/SentiMusic
A text-to-audio application that turns words and sentiments into melodies.
dimitreOliveira/GenAI-GeoGuesser
Generative AI version of the GeoGuesser game.
Yazdi9/Text-To-Audio-ChatGPT
Text to Audio (Voice, Music) with ChatGPT support
brayanjeshua/chatgpt-to-speech
ChatGPT Text-to-Speech Application
mohaimenulislamshawon/text-to-voice-speech-converter
This program is based on Google Text-to-Speech. It can convert text to speech in the top 20 languages. I made it for educational and experimental purposes.