qiansichong's Stars
homebrewltd/ichigo
Llama3.1 learns to Listen
yangdongchao/RSTnet
Real-time Speech-Text Foundation Model Toolkit (wip)
xinchen-ai/Westlake-Omni
wwbin2017/bailing
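Bailing (百聆) is a GPT-4o-style voice conversation bot built from an ASR + LLM + TTS pipeline, with latency as low as 800 ms. It runs on low-spec hardware and supports interruption.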
kyutai-labs/moshi
hsiehjackson/ASR-wav2vec2.0
This repo is for zh-TW ASR with wav2vec2.0.
OpenMOSS/AnyGPT
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
thu-coai/CDial-GPT
A Large-scale Chinese Short-Text Conversation Dataset and Chinese pre-training dialog models
0nutation/SpeechGPT
SpeechGPT Series: Speech Large Language Models
YouTaoBaBa/Chinese-Dialogue-Dataset
A collection of the currently available open-source Chinese dialogue datasets
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
karpathy/LLM101n
LLM101n: Let's build a Storyteller
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Chivier/easy-gpt4o
Easy-GPT4O, open-source version
svpino/alloy-voice-assistant
panyanyany/Awesome-ChatTTS
A comprehensive collection of ChatTTS resources: free demo links, voice timbre libraries, and more
BasedHardware/OpenGlass
Turn any glasses into AI-powered smart glasses
6drf21e/ChatTTS_colab
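🚀 One-click deployment (offline bundle included)! Built on ChatTTS, with support for streaming output, random voice-timbre sampling, long-audio generation, and multi-role narration. Easy to use, with no complicated installation.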
libukai/Awesome-ChatTTS
Officially recommended ChatTTS resource collection project, gathering related resources from across the web and frequently asked questions
2noise/ChatTTS
A generative speech model for daily dialogue.
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
wdndev/llama3-from-scratch-zh
Implementing llama3 from scratch, Chinese edition
sonos/keyword-spotting-research-datasets
aishoot/Sound_Localization_Algorithms
Classical algorithms of sound source localization with beamforming, TDOA and high-resolution spectral estimation.
MarkFzp/act-plus-plus
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
DavidDiazGuerra/gpuRIR
Python library for Room Impulse Response (RIR) simulation with GPU acceleration
GAMMA-UMD/pygsound
Impulse response generation based on state-of-the-art geometric sound propagation engine.
noahzhy/NSNet2
TF, PyTorch implementation of the paper NSNet2
Okrio/CRUSE
a lightweight network for monaural speech enhancement
crlandsc/BS-RoFormer
Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs