innnky

Pinned Repositories

ar-vits
text to speech using autoregressive transformer and VITS
Language:Python224 15 515
audio-preprocessing-scripts
Scripts for automated dataset creation
Language:Python72 4 214
descript-audio-vae
VAE modified from Descript Audio Codec, which replaces the RVQ with VAE
Language:Python42 8 15
diff-svc
An Implementation of Singing Voice Conversion Based on Diffsinger
Language:Python70 5 413
emotional-vits
An emotion-controllable speech synthesis model based on VITS that requires no emotion annotations
Language:Jupyter Notebook1.3k 12 34167
FreeSVC
Singing voice conversion based on FreeVC
Language:Python21 5 04
MagVITS
VITS with phoneme-level prosody modeling based on MaskGIT
Language:Python74 7 17
MB-iSTFT-VITS
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
Language:Jupyter Notebook44 1 010
VISinger2-nomidi
Language:Python24 2 00
vispeech
A TTS model based on VITS, FastSpeech2, and VISinger
Language:Python23 5 63

innnky's Repositories

innnky/emotional-vits
An emotion-controllable speech synthesis model based on VITS that requires no emotion annotations
Language:Jupyter Notebook1.3k 12 34167
innnky/ar-vits
text to speech using autoregressive transformer and VITS
Language:Python224 15 515
innnky/MagVITS
VITS with phoneme-level prosody modeling based on MaskGIT
Language:Python74 7 17
innnky/descript-audio-vae
VAE modified from Descript Audio Codec, which replaces the RVQ with VAE
Language:Python42 8 15
innnky/VISinger2-nomidi
Language:Python24 2 00
innnky/majobroom
Majo's Broom Mod
Language:Java17 3 129
innnky/glow-svc
singing voice conversion based on glow-tts
Language:Python11 4 22
innnky/whisper-phoneme-asr
Language:Python9 2 01
innnky/MQTTS
Mandarin version of MQTTS
Language:Python7 1 04
innnky/Bert-VITS2
VITS2 backbone with BERT
Language:Python4 1 0
innnky/AutoVocoder
Autovocoder: Fast Waveform Generation from a Learned Speech Representation using Differentiable Digital Signal Processing
Language:Python3 1 0
innnky/Diffusion-SVC
Language:Python2 0 0
innnky/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language:Python2 1 0
innnky/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Forked and maintained by the OpenVPI community
Language:Python1 1 0
innnky/edm2
Analyzing and Improving the Training Dynamics of Diffusion Models (EDM2)
1
innnky/HierTTS
Language:Python1 2 0
innnky/SoundStorm
The reproduced code for Google's SoundStorm
Language:Python1 1 01
innnky/SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on
Language:Python1 1 0
innnky/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Language:Python1 1 0
innnky/VoiceFlow-TTS
[ICASSP 2024] This is the official code for "VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching"
1
innnky/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
innnky/hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
Language:Python1 0
innnky/mar
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
innnky/rvc
Language:Python1 0
innnky/SimpleSpeech
The open source code for SimpleSpeech series
innnky/SNAC
Unofficial Pytorch implementation of SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech
Language:Python1 0
innnky/SoundStorm-pytorch
Google's SoundStorm: Efficient Parallel Audio Generation
Language:Python1 0
innnky/soundstorm-speechtokenizer
Implementation of SoundStorm built upon SpeechTokenizer.
Language:Python1 0
innnky/stable-audio-tools
Generative models for conditional audio generation
Language:Python
innnky/USLM
Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"
Language:Python1 0