xiangkanghuang's Stars

yuanzhoulvpi2017/vscode_debug_transformers
Language:Python17618
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
Language:Python5.2k369
OpenNMT/CTranslate2
Fast inference engine for Transformer models
Language:C++3.2k286
lipku/metahuman-stream
Real time interactive streaming digital human
Language:Python3.5k489
William1617/REAL_TIME_NKF_AEC
Language:C++72
fjiang9/NKF-AEC
Acoustic Echo Cancellation with Neural Kalman Filtering
Language:HTML22058
CanCLID/to-jyutping
Cantonese Pronunciation Automatic Labeling Tool
Language:TypeScript103
AudioLLMs/AudioLLM
Audio Large Language Models
623
KAIST-MACLab/PyTSMod
An open-source Python library for audio time-scale modification.
Language:Python19227
microsoft/AudioEntailment
Audio Entailment: Deductive Reasoning for Audio Understanding
101
ex3ndr/supervoice-hybrid
My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless, and TortoiseTTS into one
Language:Jupyter Notebook272
k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
Language:C++3.2k371
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Language:Jupyter Notebook4.7k620
sstzal/DiffTalk
[CVPR2023] The implementation for "DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation"
Language:Python43841
speechbrain/benchmarks
This repository contains the SpeechBrain Benchmarks
Language:Python8835
tech-shrimp/docker_image_pusher
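Uses GitHub Actions to copy Docker images from overseas registries to an Alibaba Cloud private registry for use by servers in mainland China. Free and easy to use.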
1.5k8.6k
warmshao/FasterLivePortrait
Bring portraits to life in real time! ONNX/TensorRT support! Real-time portrait animation!
Language:Python46043
lucidrains/rectified-flow-pytorch
Implementation of rectified flow and some of its follow-up research / improvements in PyTorch
Language:Python1472
xiaoxiaomiao323/MSA
Language:Jupyter Notebook13
QwenLM/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
Language:Python1.1k65
sai-soum/Diff-MST
Multitrack music mixing style transfer given a reference song using a differentiable mixing console.
Language:Jupyter Notebook271
RoyChao19477/SEMamba
This is the official implementation of the SEMamba paper. (Accepted to IEEE SLT 2024)
Language:Python11811
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language:Python13.5k1.2k
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language:Python27.1k4k
PCSX2/pcsx2
PCSX2 - The PlayStation 2 Emulator
Language:C++11.5k1.6k
AkshathRaghav/tinyspeech
Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices"
Language:Python102
YoMio-Tech-Inc/GPT-SoVITS2
GPT-SoVITS2
Language:Python16913
BadToBest/EchoMimic
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
Language:Python2.5k290
PeterH0323/Streamer-Sales
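Streamer-Sales (Sales Champion): an LLM for live-stream sales hosts 🛒🎁 that, given a product's features, presents the product in a way designed to spark viewers' desire to buy. 🚀⭐ Includes a detailed data-generation pipeline❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that searches the web for real-time information 🌐, ASR (speech-to-text) 🎙️, a frontend built on the Vue ecosystem 🍍, a FastAPI backend 🗝️, and Docker Compose packaging and deployment 🐋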
Language:Python2.4k341
zeyuxie29/AudioTime
Language:Python22