ailingzengzzz's Stars
yiyuzhuang/IDOL
Website
LLaVA-VL/LLaVA-NeXT
RenShuhuai-Andy/TimeChat
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
IamCreateAI/Ruyi-Models
Tencent/HunyuanVideo
HunyuanVideo: A Systematic Framework For Large Video Generation Model
antgroup/echomimic_v2
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Francis-Rings/StableAnimator
We present StableAnimator, the first end-to-end ID-preserving video diffusion framework, which synthesizes high-quality videos without any post-processing, conditioned on a reference image and a sequence of poses.
IDEA-Research/ChatRex
Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
NVIDIA/Cosmos-Tokenizer
A suite of image and video neural tokenizers
WangWenhao0716/TIP-I2V
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation
VideoVerses/VideoTuna
Let's finetune video generation models!
etched-ai/open-oasis
Inference script for Oasis 500M
rhymes-ai/Allegro
Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.
facebookresearch/MovieGenBench
Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen
brentyi/egoallo
Estimating Body and Hand Motion in an Ego-sensed World
Yukun-Huang/DreamWaltz-G
Official implementation of the paper "DreamWaltz-G: Expressive 3D Gaussian Avatars from Skeleton-Guided 2D Diffusion".
AILab-CVC/VideoGen-Eval
The Dawn of Video Generation: Preliminary Explorations with SORA-like Models
NVlabs/ProtoMotions
aigc-apps/CogVideoX-Fun
📹 A more flexible CogVideoX that can generate videos at any resolution and create videos from images.
zju3dv/GVHMR
Code for "GVHMR: World-Grounded Human Motion Recovery via Gravity-View Coordinates", SIGGRAPH Asia 2024
xinchengshuai/Awesome-Image-Editing
A Survey of Image Editing
wyhuai/SkillMimic
Official code release for the paper "SkillMimic: Learning Reusable Basketball Skills from Demonstrations"
facebookresearch/sapiens
High-resolution models for human tasks.
cure-lab/MotionCraft
Official repo for paper "[AAAI'25] MotionCraft: Crafting Whole-Body Motion with Plug-and-Play Multimodal Controls"
LTH14/mar
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
IDEA-Research/Grounded-SAM-2
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
shad0wta9/meshavatar
Code Repository for MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos (ECCV 2024)
zhenzhiwang/HumanVid
[NeurIPS D&B Track 2024] Official implementation of HumanVid
mayuelala/FollowYourEmoji
[SIGGRAPH Asia 2024] Official implementation of "Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation"
TimoBolkart/FLAME-Universe
Summary of publicly available resources such as code, datasets, and scientific papers for the FLAME 3D head model