shu-le's Stars
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
instantX-research/InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
magic-research/magic-animate
[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
TencentARC/PhotoMaker
PhotoMaker [CVPR 2024]
facebookresearch/sapiens
High-resolution models for human tasks.
NUS-HPC-AI-Lab/VideoSys
VideoSys: An easy and efficient system for video generation
THUDM/SwissArmyTransformer
SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.
showlab/Show-o
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
Vchitect/SEINE
[ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
KovenYu/WonderJourney
foivospar/Arc2Face
[ECCV 2024 Oral🔥] Arc2Face: A Foundation Model for ID-Consistent Human Faces
liuff19/ReconX
ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model
ali-vilab/FlashFace
ID-Animator/ID-Animator
guanjz20/StyleSync
Official code of CVPR '23 paper "StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator"
aim-uofa/MovieDreamer
zhenzhiwang/HumanVid
Official implementation of HumanVid, NeurIPS D&B Track 2024
jeanne-wang/svd_keyframe_interpolation
RafailFridman/SceneScape
Official PyTorch Implementation for "SceneScape: Text-Driven Consistent Scene Generation"
kyegomez/Vit-RGTS
Open source implementation of "Vision Transformers Need Registers"
ZCMax/LLaVA-3D
A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World
vaew/SkyScript-100M
SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama: https://arxiv.org/abs/2408.09333v2
chen-wl20/DreamCinema
DreamCinema: Cinematic Transfer with Free Camera and 3D Character
QQ-MM/Video-CCAM
A lightweight, flexible Video-MLLM developed by the TencentQQ Multimedia Research Team.
robincourant/DIRECTOR
WUyinwei-hah/IFAdapter
Official implementation of "IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation".
eckertzhang/HumanRef
tobran/StoryImager
[ECCV2024] StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion
baojudezeze/Generative-Virtual-Try-On
Generative virtual try-on (VTON): try-on images of characters can be generated from a text prompt.
kunyao2015/StyleLipSync
[ICCV 2023] Official PyTorch implementation of "StyleLipSync: Style-based Personalized Lip-sync Video Generation".