xiangkanghuang's Stars
yuanzhoulvpi2017/vscode_debug_transformers
sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
OpenNMT/CTranslate2
Fast inference engine for Transformer models
lipku/metahuman-stream
Real time interactive streaming digital human
William1617/REAL_TIME_NKF_AEC
fjiang9/NKF-AEC
Acoustic Echo Cancellation with Neural Kalman Filtering
CanCLID/to-jyutping
Cantonese Pronunciation (Jyutping) Automatic Labeling Tool
AudioLLMs/AudioLLM
Audio Large Language Models
KAIST-MACLab/PyTSMod
An open-source Python library for audio time-scale modification.
microsoft/AudioEntailment
Audio Entailment: Deductive Reasoning for Audio Understanding
ex3ndr/supervoice-hybrid
My hybrid TTS network that combines VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one
k2-fsa/sherpa-onnx
Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
sstzal/DiffTalk
[CVPR2023] The implementation for "DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation"
speechbrain/benchmarks
This repository contains the SpeechBrain Benchmarks
tech-shrimp/docker_image_pusher
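Use GitHub Actions to copy Docker images hosted outside China into a private Alibaba Cloud registry for use by servers in mainland China; free and easy to use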
warmshao/FasterLivePortrait
Bring portraits to life in real time! ONNX/TensorRT support! Real-time portrait animation!
lucidrains/rectified-flow-pytorch
Implementation of rectified flow and some of its follow-up research / improvements in PyTorch
xiaoxiaomiao323/MSA
QwenLM/Qwen2-Audio
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
sai-soum/Diff-MST
Multitrack music mixing style transfer given a reference song using differentiable mixing console.
RoyChao19477/SEMamba
This is the official implementation of the SEMamba paper. (Accepted to IEEE SLT 2024)
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
PCSX2/pcsx2
PCSX2 - The PlayStation 2 Emulator
AkshathRaghav/tinyspeech
Code release for "TinySpeech: Attention Condensers for Deep Speech Recognition Neural Networks on Edge Devices"
YoMio-Tech-Inc/GPT-SoVITS2
GPT-SoVITS2
BadToBest/EchoMimic
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
PeterH0323/Streamer-Sales
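Streamer-Sales (Sales Champion): a live-commerce host LLM 🛒🎁 that, given a product's features, presents the product in a way designed to spark viewers' desire to buy. 🚀⭐ Includes a detailed data-generation pipeline ❗ 📦 Also integrates LMDeploy for accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an Agent that queries the web for real-time information 🌐, ASR (speech-to-text) 🎙️, a Vue-based frontend 🍍, a FastAPI backend 🗝️, and Docker Compose packaging and deployment 🐋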
zeyuxie29/AudioTime