DogeFlow's Stars
fudan-generative-vision/hallo2
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
jdh-algo/JoyHallo
JoyHallo: Digital human model for Mandarin
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
waydabber/BetterDisplay
Unlock your displays on your Mac! Flexible HiDPI scaling, XDR/HDR extra brightness, virtual screens, DDC control, extra dimming, PIP/streaming, EDID override and lots more!
Caldis/Mos
A lightweight tool that smooths mouse scrolling on macOS and lets you set the scroll direction independently, making your scroll wheel feel as smooth as a trackpad
yerfor/MimicTalk
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes; NeurIPS 2024; Official code
monologg/JointBERT
PyTorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling"
Henry-23/VideoChat
Real-time voice-interactive digital human, supporting both an end-to-end voice pipeline (GLM-4-Voice - THG) and a cascaded pipeline (ASR-LLM-TTS-THG). Appearance and voice are customizable without training, voice cloning is supported, and first-packet latency is as low as 3s.
google-research/distilling-step-by-step
microsoft/LMOps
General technology for enabling AI capabilities w/ LLMs and MLLMs
anliyuan/Ultralight-Digital-Human
An ultra-lightweight digital human model that can run in real time on mobile devices
jingyaogong/minimind
"Large model": train a small 26M-parameter GPT completely from scratch in 3 hours; inference and training both run on a personal GPU!
kyutai-labs/moshi
microsoft/WSL
Issues found on WSL
antgroup/echomimic
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
fishaudio/fish-speech
SOTA Open Source TTS
nginx/nginx
The official NGINX Open Source repository.
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
2noise/ChatTTS
A generative speech model for daily dialogue.
YuanxunLu/LiveSpeechPortraits
Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation (SIGGRAPH Asia 2021)
Nota-NetsPresso/nota-wav2lip
A 28× Compressed Wav2Lip for Efficient Talking Face Generation [ICCV'23 Demo] [MLSys'23 Workshop] [NVIDIA GTC'23]
Fictionarry/TalkingGaussian
[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
anothermartz/Easy-Wav2Lip
Colab for making Wav2Lip high quality and easy to use
v3ucn/DCT-Net_Webui
A Gradio web UI for image/video stylization based on DCT-Net
menyifang/DCT-Net
Official implementation of "DCT-Net: Domain-Calibrated Translation for Portrait Stylization", SIGGRAPH 2022 (TOG); Multi-style cartoonization
modelscope/DiffSynth-Studio
Enjoy the magic of Diffusion models!
vitalik/django-ninja
💨 Fast, async-ready, OpenAPI, type-hint-based framework for building APIs
quic/ai-hub-models
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.