WangGewu

WangGewu's Stars

mudler/LocalAI
:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement for OpenAI, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more model architectures. Features: Generate Text, Audio, Video, Images, Voice Cloning, Distributed, P2P inference
Language:Go28k2.1k
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Language:Python21.3k2.2k
ScottishFold007/TTSAudioNormalizer
TTSAudioNormalizer is a specialized tool for TTS data production, featuring descriptive statistical analysis of audio loudness and loudness normalization operations.
Language:Python8614
fishaudio/vocoder
Language:Python844
modelscope/ClearerVoice-Studio
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
Language:Python1.9k138
lucadellalib/discrete-wavlm-codec
A neural speech codec based on discrete WavLM representations
Language:Python222
francislata/unicats
An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding".
Language:Python241
Jackiexiao/tts-frontend-dataset
TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization
Language:Python9215
Aria-K-Alethia/BigCodec
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
Language:Python1108
ZhangXInFD/SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on
Language:Python51245
Plachtaa/seed-vc
zero-shot voice conversion & singing voice conversion, with real-time support
Language:Python864106
opendilab/CleanS2S
High-quality and streaming Speech-to-Speech interactive agent in a single file. A streaming, full-duplex voice interaction prototype agent implemented in just one file!
Language:Python30729
Hoper-J/AI-Guide-and-Demos-zh_CN
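A step-by-step guide to getting started with AI and large language models (LLMs), with tutorials and demo code that take you from APIs to local LLM deployment and fine-tuning. Code files come with online Kaggle or Colab versions, so you can learn even without a GPU. The project also includes a small code playground 🎡 where you can experiment with some interesting AI scripts. It also contains a complete Chinese mirror of the assignments for Hung-yi Lee's 2024 Introduction to Generative AI course.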
Language:Python53170
liutaocode/TTS-arxiv-daily
Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)
Language:Python33822
yoosif0/arabic-tacotron-tts
End-to-end Arabic TTS system based on Tacotron
Language:Python11934
TaoRuijie/ECAPA-TDNN
Unofficial reimplementation of ECAPA-TDNN for speaker recognition (EER=0.86 for Vox1_O when trained only on Vox2)
Language:Python626115
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Language:Python38.3k6.2k
THUDM/GLM-4-Voice
GLM-4-Voice | End-to-end Chinese-English spoken dialogue model
Language:Python2.5k205
BytedanceSpeech/seed-tts-eval
Language:Python1.1k109
NVIDIA/BigVGAN
Official PyTorch implementation of BigVGAN (ICLR 2023)
Language:Python931111
svc-develop-team/so-vits-svc
SoftVC VITS Singing Voice Conversion
Language:Python26.3k4.9k
vivian556123/NeurIPS2024-CoVoMix
Official repo for CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations
Language:Python412
XinhaoMei/WavCaps
This repository contains metadata of the WavCaps dataset and code for downstream tasks.
Language:Python21212
Rikorose/DeepFilterNet
Noise suppression using deep filtering
Language:Python2.6k244
SWivid/F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Language:Python8.7k1.1k
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Language:Python38.4k4.3k
FireRedTeam/FireRedTTS
An Open-Sourced LLM-empowered Foundation TTS System
Language:Python51937
haoheliu/versatile_audio_super_resolution
Versatile audio super resolution (any -> 48kHz) with AudioSR.
Language:Python1.2k125
kyutai-labs/moshi
Language:Python7.1k555
thuhcsi/SECap
Language:Python14913