wangyang2014's Stars
CMU-Perceptual-Computing-Lab/openpose
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
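A minimal sketch of text-conditioned generation with MusicGen via the audiocraft Python API; the checkpoint name (facebook/musicgen-small), the 8-second duration, and the prompt are illustrative assumptions, not taken from this list:

```python
# Hedged sketch: text-to-music generation with MusicGen from audiocraft.
# Checkpoint name, duration, and prompt are illustrative assumptions.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained("facebook/musicgen-small")  # assumed small checkpoint
model.set_generation_params(duration=8)                     # roughly 8 seconds of audio

descriptions = ["warm lo-fi beat with soft piano"]           # textual conditioning
wav = model.generate(descriptions)                           # batch of waveforms at model.sample_rate

for idx, one_wav in enumerate(wav):
    # audio_write appends the file extension and applies loudness normalization
    audio_write(f"sample_{idx}", one_wav.cpu(), model.sample_rate, strategy="loudness")
```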
deepseek-ai/DeepSeek-V3
Tencent/HunyuanVideo
HunyuanVideo: A Systematic Framework For Large Video Generative Models
yangchris11/samurai
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
open-mmlab/mmpose
OpenMMLab Pose Estimation Toolbox and Benchmark.
ZheC/Realtime_Multi-Person_Pose_Estimation
Code repo for realtime multi-person pose estimation in CVPR'17 (Oral)
GuyTevet/motion-diffusion-model
The official PyTorch implementation of the paper "Human Motion Diffusion Model"
Lightricks/LTX-Video
Official repository for LTX-Video
antgroup/echomimic_v2
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
OpenGVLab/InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Text-to-Audio/AudioLCM
PyTorch implementation of AudioLCM (ACM-MM'24): an efficient and high-quality text-to-audio generation model based on a latent consistency model.
kakaobrain/rq-vae-transformer
The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)
affige/genmusic_demo_list
a list of demo websites for automatic music generation research
ivcylc/OpenMusic
OpenMusic: SOTA Text-to-music (TTM) Generation
SeanChenxy/Hand3DResearch
yzhang2016/video-generation-survey
A reading list on video generation
hugofloresgarcia/vampnet
music generation with masked transformers!
ai4r/Gesture-Generation-from-Trimodal-Context
Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity (SIGGRAPH Asia 2020)
DiffPoseTalk/DiffPoseTalk
DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models
yhw-yhw/SHOW
The codebase for SHOW from "Generating Holistic 3D Human Motion from Speech" [CVPR2023]
YanzuoLu/CFLD
[CVPR 2024 Highlight] Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis
ZhengdiYu/Arbitrary-Hands-3D-Reconstruction
🔥(CVPR 2023) ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction
CNChTu/FCPE
Frank-ZY-Dou/EMDM
X-E-Speech/X-E-Speech-code
X-E-Speech: Joint Training Framework of Non-Autoregressive Cross-lingual Emotional Text-to-Speech and Voice Conversion
thuhcsi/S2G-MDDiffusion
TIGER-AI-Lab/VideoScore
official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024]
PKBHY/WaveFM
WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
ffxzh/KMTalk
[ECCV2024 official] KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding