kafan1986's Stars
AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
unslothai/unsloth
Finetune Llama 4, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
RVC-Project/Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
Camb-ai/MARS5-TTS
MARS5 speech model (TTS) from CAMB.AI
ebitengine/purego
s3prl/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
DigitalPhonetics/IMS-Toucan
Controllable and fast Text-to-Speech for over 7000 languages!
aliutkus/speechmetrics
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
gabrielmittag/NISQA
NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment
met4citizen/TalkingHead
Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.
jxmorris12/language_tool_python
a free python grammar checker 📝✅
tijiang13/InstantAvatar
InstantAvatar: Learning Avatars from Monocular Video in 60 Seconds (CVPR 2023)
lochenchou/MOSNet
Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
pseeth/torch-stft
An STFT/iSTFT for PyTorch.
fjiang9/NKF-AEC
Acoustic Echo Cancellation with Neural Kalman Filtering
kraiskil/onnx2c
Open Neural Network Exchange to C compiler.
cpldcpu/BitNetMCU
Neural networks with low-bit weights on low-end 32-bit microcontrollers such as the CH32V003 RISC-V microcontroller and others
davidmartinrius/speech-dataset-generator
🔊 Create labeled datasets, enhance audio quality, identify speakers, support diverse dataset types. 🎧👥📊 Advanced audio processing.
p0p4k/pflowtts_pytorch
Unofficial implementation of NVIDIA P-Flow TTS paper
crazy-max/xgo
Go CGO cross compiler
RoyChao19477/SEMamba
This is the official implementation of the SEMamba paper. (Accepted to IEEE SLT 2024)
sh-lee-prml/PeriodWave
The official Implementation of PeriodWave and PeriodWave-Turbo
RevoSpeechTech/speech-datasets-collection
a curated list of speech datasets (110+ datasets, 75+ easy to download)
Choddeok/EmoSphere-TTS
The official implementation of EmoSphere-TTS
AI4Bharat/vistaar
Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR
YofarDev/yofardev_ai
SapphireLab/Sapphire-TTS-Collection
enric1994/pose2avatar
Animate a 3D model using Blender and OpenPose
DataSenseiAryan/GoogleSpeechCommandLowFootprint
This repository contains the code for a SOTA model on the Google Speech Command V2 dataset.