kafan1986

kafan1986's Stars

AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
Language: Python 151k 1.1k 7.8k 28.1k
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Language: Python 43.9k 233 1.7k 4.9k
unslothai/unsloth
Finetune Llama 4, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
Language: Python 36.9k 216 1.8k 2.9k
RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
Language: Python 28.5k 190 1.8k 4k
Camb-ai/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
Language: Jupyter Notebook 2.6k 36 51219
ebitengine/purego
Language: Go 2.6k 25 11181
s3prl/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Language: Python 2.4k 44 401497
DigitalPhonetics/IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
Language: Python 1.6k 22 171180
aliutkus/speechmetrics
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
Language: Python 954 21 34164
gabrielmittag/NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
Language: Python 758 25 51129
met4citizen/TalkingHead
Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.
Language: JavaScript 507 16 84152
jxmorris12/language_tool_python
a free python grammar checker 📝✅
Language: Python 457 9 7965
tijiang13/InstantAvatar
InstantAvatar: Learning Avatars from Monocular Video in 60 Seconds (CVPR 2023)
Language: Python 392 14 8333
lochenchou/MOSNet
Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
Language: Python 358 10 1065
pseeth/torch-stft
An STFT/iSTFT for PyTorch.
Language: Python 356 6 1452
fjiang9/NKF-AEC
Acoustic Echo Cancellation with Neural Kalman Filtering
Language: HTML 284 9 2465
kraiskil/onnx2c
Open Neural Network Exchange to C compiler.
Language: C 272 10 4142
cpldcpu/BitNetMCU
Neural Networks with low bit weights on low end 32 bit microcontrollers such as the CH32V003 RISC-V Microcontroller and others
Language: C 264 14 428
davidmartinrius/speech-dataset-generator
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.
Language: Python 243 16 1423
p0p4k/pflowtts_pytorch
Unofficial implementation of NVIDIA P-Flow TTS paper
Language: Python 222 14 4332
crazy-max/xgo
Go CGO cross compiler
Language: Shell 213 3 3828
RoyChao19477/SEMamba
This is the official implementation of the SEMamba paper. (Accepted to IEEE SLT 2024)
Language: Python 189 13 2122
sh-lee-prml/PeriodWave
The official Implementation of PeriodWave and PeriodWave-Turbo
Language: Python 182 35 611
RevoSpeechTech/speech-datasets-collection
a curated list of speech datasets (110+ datasets, 75+ easy to download)
128 8 24
Choddeok/EmoSphere-TTS
The official implementation of EmoSphere-TTS
Language: Python 112 6 513
AI4Bharat/vistaar
Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR
Language: Python 49 4 718
YofarDev/yofardev_ai
Language: Dart 41 1 00
SapphireLab/Sapphire-TTS-Collection
Language: Python 29 3 00
enric1994/pose2avatar
Animate a 3D model using Blender and OpenPose
Language: Python 19 3 04
DataSenseiAryan/GoogleSpeechCommandLowFootprint
This repository contains the Code for SOTA model on Google Speech Command V2 dataset.
Language: Python 14 3 10