Pinned Repositories
ar-vits
Text-to-speech using an autoregressive transformer and VITS
audio-preprocessing-scripts
Scripts for automated dataset creation
descript-audio-vae
VAE modified from Descript Audio Codec, with the RVQ replaced by a VAE
diff-svc
An Implementation of Singing Voice Conversion Based on Diffsinger
emotional-vits
Emotion-controllable speech synthesis model that requires no emotion annotations, based on VITS
FreeSVC
Singing voice conversion based on FreeVC
MagVITS
VITS with phoneme-level prosody modeling based on MaskGIT
MB-iSTFT-VITS
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
VISinger2-nomidi
vispeech
TTS model based on VITS, FastSpeech2, and VISinger
innnky's Repositories
innnky/emotional-vits
Emotion-controllable speech synthesis model that requires no emotion annotations, based on VITS
innnky/ar-vits
Text-to-speech using an autoregressive transformer and VITS
innnky/MagVITS
VITS with phoneme-level prosody modeling based on MaskGIT
innnky/descript-audio-vae
VAE modified from Descript Audio Codec, with the RVQ replaced by a VAE
innnky/VISinger2-nomidi
innnky/majobroom
Majo's Broom Mod
innnky/glow-svc
Singing voice conversion based on Glow-TTS
innnky/whisper-phoneme-asr
innnky/MQTTS
Mandarin version of MQTTS
innnky/Bert-VITS2
VITS2 backbone with BERT
innnky/AutoVocoder
Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing
innnky/Diffusion-SVC
innnky/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
innnky/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Forked and maintained by the OpenVPI community
innnky/edm2
Analyzing and Improving the Training Dynamics of Diffusion Models (EDM2)
innnky/HierTTS
innnky/SoundStorm
The reproduced code for Google's SoundStorm
innnky/SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on
innnky/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
innnky/VoiceFlow-TTS
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
innnky/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
innnky/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
innnky/mar
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
innnky/rvc
innnky/SimpleSpeech
The open source code for SimpleSpeech series
innnky/SNAC
Unofficial PyTorch implementation of SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech
innnky/SoundStorm-pytorch
Google's SoundStorm: Efficient Parallel Audio Generation
innnky/soundstorm-speechtokenizer
Implementation of SoundStorm built upon SpeechTokenizer.
innnky/stable-audio-tools
Generative models for conditional audio generation
innnky/USLM
Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"