DogeFlow

DogeFlow's Stars

fudan-generative-vision/hallo2
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
Language: Python, 4.5k stars, 644 forks
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Language: Python, 13.2k stars, 1.4k forks
jdh-algo/JoyHallo
JoyHallo: Digital human model for Mandarin
Language: Python, 410 stars, 43 forks
speechbrain/speechbrain
A PyTorch-based Speech Toolkit
Language: Python, 9.2k stars, 1.4k forks
waydabber/BetterDisplay
Unlock your displays on your Mac! Flexible HiDPI scaling, XDR/HDR extra brightness, virtual screens, DDC control, extra dimming, PIP/streaming, EDID override and lots more!
21.8k stars, 381 forks
Caldis/Mos
A lightweight tool for macOS that smooths mouse scrolling and lets you set the scroll direction independently, making your scroll wheel feel as smooth as a trackpad
Language: Swift, 15.1k stars, 530 forks
yerfor/MimicTalk
MimicTalk: Mimicking a personalized and expressive 3D talking face in minutes; NeurIPS 2024; Official code
Language:Python51458
monologg/JointBERT
Pytorch implementation of JointBERT: "BERT for Joint Intent Classification and Slot Filling"
Language: Python, 679 stars, 186 forks
Henry-23/VideoChat
Real-time voice-interactive digital human, supporting end-to-end voice solutions (GLM-4-Voice - THG) and cascaded solutions (ASR-LLM-TTS-THG). Appearance and voice are customizable with no training required, voice cloning is supported, and first-packet latency is as low as 3s.
Language:Python59979
google-research/distilling-step-by-step
Language:Python44766
microsoft/LMOps
General technology for enabling AI capabilities w/ LLMs and MLLMs
Language: Python, 3.8k stars, 286 forks
anliyuan/Ultralight-Digital-Human
An ultra-lightweight digital human model that can run in real time on mobile devices
Language: Python, 1.4k stars, 204 forks
jingyaogong/minimind
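[Large language model] Train a small 26M-parameter GPT completely from scratch in 3 hours; inference and training both run on a personal GPU!
Language: Python, 3.5k stars, 436 forks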
kyutai-labs/moshi
Language: Python, 7.1k stars, 553 forks
microsoft/WSL
Issues found on WSL
Language: Python, 17.7k stars, 838 forks
antgroup/echomimic
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
Language: Python, 3.4k stars, 392 forks
fishaudio/fish-speech
SOTA Open Source TTS
Language: Python, 18.2k stars, 1.4k forks
nginx/nginx
The official NGINX Open Source repository.
Language: C, 25.7k stars, 7.1k forks
myshell-ai/MeloTTS
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
Language: Python, 5.1k stars, 689 forks
2noise/ChatTTS
A generative speech model for daily dialogue.
Language: Python, 33.4k stars, 3.6k forks
YuanxunLu/LiveSpeechPortraits
Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation (SIGGRAPH Asia 2021)
Language: Python, 1.2k stars, 215 forks
Nota-NetsPresso/nota-wav2lip
A 28× Compressed Wav2Lip for Efficient Talking Face Generation [ICCV'23 Demo] [MLSys'23 Workshop] [NVIDIA GTC'23]
Language:Python526
Fictionarry/TalkingGaussian
[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
Language:Python29735
EleutherAI/lm-evaluation-harness
A framework for few-shot evaluation of language models.
Language: Python, 7.4k stars, 2k forks
anothermartz/Easy-Wav2Lip
Colab for making Wav2Lip high quality and easy to use
Language: Jupyter Notebook, 733 stars, 120 forks
v3ucn/DCT-Net_Webui
A Gradio web UI for DCT-Net-based image and video stylization
Language:Jupyter Notebook217
menyifang/DCT-Net
Official implementation of "DCT-Net: Domain-Calibrated Translation for Portrait Stylization", SIGGRAPH 2022 (TOG); Multi-style cartoonization
Language:Jupyter Notebook78977
modelscope/DiffSynth-Studio
Enjoy the magic of Diffusion models!
Language: Python, 6.7k stars, 625 forks
vitalik/django-ninja
💨 Fast, Async-ready, Openapi, type hints based framework for building APIs
Language: Python, 7.6k stars, 449 forks
quic/ai-hub-models
The Qualcomm® AI Hub Models are a collection of state-of-the-art machine learning models optimized for performance (latency, memory etc.) and ready to deploy on Qualcomm® devices.
Language:Python55287