MisakaMikoto96's Stars
AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
coqui-ai/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
RVC-Boss/GPT-SoVITS
1 min of voice data can also be used to train a good TTS model! (few-shot voice cloning)
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
bmaltais/kohya_ss
Plachtaa/VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available at https://plachtaa.github.io/vallex/
HumanAIGC/OutfitAnyone
Outfit Anyone: Ultra-high quality virtual try-on for Any Clothing and Any Person
yl4579/StyleTTS2
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
Akegarasu/lora-scripts
LoRA & Dreambooth training scripts & GUI using kohya-ss's trainer, for diffusion models.
rosinality/vq-vae-2-pytorch
Implementation of Generating Diverse High-Fidelity Images with VQ-VAE-2 in PyTorch
lucidrains/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
wangkai930418/awesome-diffusion-categorized
Collection of diffusion model papers categorized by their subareas
NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
152334H/tortoise-tts-fast
Fast TorToiSe inference (5x or your money back!)
lucidrains/MEGABYTE-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
OlaWod/FreeVC
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
facebookresearch/AudioMAE
This repo hosts the code and models of "Masked Autoencoders that Listen".
ZhangXInFD/SpeechTokenizer
This is the code for the SpeechTokenizer presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
adefossez/julius
Fast PyTorch based DSP for audio and 1D signals
v-iashin/SpecVQGAN
Source code for "Taming Visually Guided Sound Generation" (Oral at the BMVC 2021)
rhasspy/gruut
A tokenizer, text cleaner, and phonemizer for many human languages.
innnky/ar-vits
text to speech using autoregressive transformer and VITS
zyzisyz/mfa_conformer
X-LANCE/UniCATS-CTX-vec2wav
[AAAI 2024] Code for CTX-vec2wav in UniCATS
ga642381/SpeechPrompt-v2
"SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks": speech processing with the prompting paradigm
rishikksh20/NaturalSpeech2
elevenlabs/elevenlabs-docs
Documentation for elevenlabs.io/docs
PlayVoice/BigVGAN
BigVGAN with Neural Source-Filter
Pranjalya/tts-tortoise-gradio
A Gradio setup for Tortoise TTS.
canberk17/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.