939163156's Stars
hacksider/Deep-Live-Cam
Real-time face swap and one-click video deepfake with only a single image
ggerganov/whisper.cpp
Port of OpenAI's Whisper model in C/C++
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
lllyasviel/Omost
Your image is almost there!
immortalwrt/immortalwrt
An open-source OpenWrt variant for mainland China users.
Ucas-HaoranWei/GOT-OCR2.0
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
google-deepmind/alphageometry
gpt-omni/mini-omni
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
BadToBest/EchoMimic
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
imsyy/SPlayer
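🎉 A minimalist music player with word-by-word lyrics, song downloads, comment display, music cloud drive and playlist management, audio spectrum, and basic mobile adaptation | NetEase Cloud Music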
233boy/Xray
The most user-friendly Xray one-click installation and management script
Moriafly/DsoMusic
A good-looking Android music app written in Kotlin. Audio sources: NetEase Cloud Music, QQ Music, Kuwo Music, Bilibili
microsoft/DNS-Challenge
This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.
k2-fsa/icefall
Spr-Aachen/Easy-Voice-Toolkit
A locally deployable AI voice toolbox | A user-friendly audio toolkit for voice recognition, voice transcription, voice conversion, etc.
Vchitect/VBench
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
wenet-e2e/WeTextProcessing
Text Normalization & Inverse Text Normalization
glomatico/spotify-web-downloader
A Python CLI app for downloading songs and music videos directly from Spotify.
aharley/pips2
PIPs++
jrgillick/laughter-detection
k2-fsa/libriheavy
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
lovemefan/SenseVoice.cpp
Port of FunASR's SenseVoice model in C/C++
yhsj0919/music_api
Music API covering myfreemp3, bd, kg, kw, mg, qq, and wy, all in one place
dspearson/librespot-auth
NSoiffer/MathCAT
MathCAT: Math Capable Assistive Technology for generating speech, braille, and navigation.
WUyinwei-hah/IFAdapter
Official implementation of "IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation".
jingzhunxue/flow_mirror
flow mirror models from JZX AI Labs
tsengwoody/Access8Math
Allows access to math content written in MathML; assists in writing math content in LaTeX
skeskinen/resemble-denoise-onnx-inference
Inference of resemble denoiser
Happenmass/LiveAssistPro
LiveAssistPro is an AI assistant for live streaming that uses Zhipu AI’s vision model to analyze screen content and return JSON descriptions. It detects user speech via VAD and ASR, enabling real-time interaction with an AI role-playing assistant, which adapts responses based on both screen visuals and spoken input for dynamic engagement.