ChiRenjun's Stars
lovemefan/SenseVoice.cpp
Port of FunASR's SenseVoice model in C/C++
huggingface/speech-to-speech
Speech To Speech: an effort for an open-sourced and modular GPT4-o
facebookresearch/seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Yuan-ManX/ai-audio-datasets
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications.
s3prl/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
ufal/whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
DongKeon/Awesome-Speaker-Diarization
Some comprehensive papers about speaker diarization
QwenLM/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
linto-ai/whisper-timestamped
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
ggerganov/whisper.cpp
Port of OpenAI's Whisper model in C/C++
QwenLM/Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
ggerganov/llama.cpp
LLM inference in C/C++
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
huggingface/parler-tts
Inference and training library for high-quality TTS models.
RVC-Boss/GPT-SoVITS
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
mkunes/w2v2_audioFrameClassification
wav2vec2 audio classification for prosodic boundary detection and other tasks
yeyupiaoling/MASR
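A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with multiple data augmentation methods.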
ChristopherGS/ultimate-fastapi-tutorial
The Ultimate FastAPI Tutorial
UnicomAI/Unichat-llama3-Chinese
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
modelscope/3D-Speaker
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
LlamaFamily/Llama-Chinese
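Llama Chinese community: Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are aggregated in real time, and all code has been updated for Llama3. Aims to build the best Chinese Llama large model, fully open source and available for commercial use.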
wq2012/awesome-diarization
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding