tuocheng0824's Stars
yuan1615/AdaVocoder
Adaptive Vocoder for Custom Voice
Ryuk17/SpeechAlgorithms
You can find the speech algorithms you want here
AGENDD/RWKV-ASR
This repo is an exploratory experiment to enable frozen pretrained RWKV language models to accept speech modality input. We followed the idea of SLAM_ASR and used the RWKV language model as the LLM; instead of writing a prompt template, we directly finetuned the initial state of the RWKV model.
RicherMans/Dasheng
Source for the Interspeech 2024 Paper "Scaling up masked audio encoder learning for general audio classification"
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
WenzheLiu-Speech/awesome-speech-enhancement
speech enhancement, speech separation, sound source localization
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
X-LANCE/SLAM-LLM
Speech, Language, Audio, Music Processing with Large Language Model
modelscope/FunClip
Open-source, accurate and easy-to-use video speech recognition & clipping tool, with LLM-based AI clipping integrated.
yeyupiaoling/Whisper-Finetune
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment
wenet-e2e/wesignal
Production first, nn-based on-device signal processing toolkit.
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
modelscope/modelscope
ModelScope: bring the notion of Model-as-a-Service to life.
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
wenet-e2e/wetts
Production First and Production Ready End-to-End Text-to-Speech Toolkit
hankcs/HanLP
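Chinese word segmentation, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing, semantic dependency parsing, semantic role labeling, coreference resolution, style transfer, semantic similarity, new word discovery, keyword and phrase extraction, automatic summarization, text classification and clustering, pinyin and simplified/traditional Chinese conversion, natural language processing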
alphacep/vosk-server
WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries
alphacep/vosk-api
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Snowdar/asv-subtools
Open-source tools for speaker recognition
wildwolf1994411/VGG-Speaker-Recognition
Utterance-level Aggregation For Speaker Recognition In The Wild
llp1992/MachineLearning