SzczesnyS

SzczesnyS's Stars

RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Language: Python 42.7k 236 1.6k4.8k
lllyasviel/ControlNet
Let us control diffusion models!
Language: Python 31.8k 222 5642.8k
deezer/spleeter
Deezer source separation library including pretrained models.
Language: Python 26.5k 390 7862.9k
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Language: Python 10.5k 100 5601.4k
miss-mumu/developer2gwy
Civil service exam from getting started to passing: the best practical guide to the civil service exam for programmers
9.5k 83 80822
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
Language: Python 6k 36 630516
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
Language: Python 5.8k 44 221773
huggingface/parler-tts
Inference and training library for high-quality TTS models.
Language: Python 5.1k 56 142542
lixin4ever/Conference-Acceptance-Rate
Acceptance rates for the major AI conferences
Language: Jupyter Notebook 4.4k 129 29306
riffusion/riffusion-hobby
Stable diffusion for real-time music generation
Language: Python 3.6k 41 96421
THUDM/GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model
Language: Python 2.8k 30 143226
PlayVoice/whisper-vits-svc
Core Engine of Singing Voice Conversion & Singing Voice Clone
Language: Python 2.7k 30 169921
xunhuang1995/AdaIN-style
Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization
Language: Lua 1.5k 43 34195
SUC-DriverOld/so-vits-svc-Deployment-Documents
So-VITS-SVC local deployment and usage documentation, with a Colab notebook provided
Language: Jupyter Notebook 695 4 16108
Tele-AI/TeleSpeech-ASR
Language: Python 656 17 5859
facebookresearch/textlesslib
Library for Textless Spoken Language Processing
Language: Python 536 16 2454
ZhangXInFD/SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on
Language: Python 534 16 2349
lucidrains/e2-tts-pytorch
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in Pytorch
Language: Python 449 26 2644
CNChTu/Diffusion-SVC
Language: Python 440 9 2962
hche11/VGGSound
VGGSound: A Large-scale Audio-Visual Dataset
Language: Python 307 6 1833
jrgillick/laughter-detection
Language: Python 253 11 1351
EmilianPostolache/stable-audio-controlnet
Fine-tune Stable Audio Open with DiT ControlNet.
Language: Python 205 4 65
xinchen-ai/Westlake-Omni
Language: Python 190 7 1018
KunZhou9646/Mixed_Emotions
Language: Python 115 4 311
ejhumphrey/minst-dataset
Music INSTrument dataset
Language: Jupyter Notebook 61 4 2610
iiscleap/ZEST
Zero-Shot Emotion Style Transfer
Language: Python 42 6 108
zachary-shah/riff-cnet
Controlled audio inpainting using SD-fine tuned model Riffusion in a ControlNet Architecture
Language: Jupyter Notebook 281
ilpoviertola/V-AURA
The official implementation of V-AURA: Temporally Aligned Audio for Video with Autoregression (ICASSP 2025)
Language: Python 19 2 21
d3n7/riffusionPrepper
Prepare spectrograms from audio for training a Riffusion model
Language: Python 14 1 11
dhivyasreedhar/Music-Instrument-Recognition
A Convolutional Neural Network and a K-nearest-neighbour-based classifier to detect the musical instrument present in a given audio file. It can be used for monophonic files. Both classifiers performed well, with accuracy above 90%.
Language: Jupyter Notebook 8 2 02