tuofeilunhifi's Stars
InstantID/InstantID
InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
TencentARC/PhotoMaker
PhotoMaker
LiheYoung/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
megvii-research/ECCV2022-RIFE
ECCV2022 - Real-Time Intermediate Flow Estimation for Video Frame Interpolation
guoqincode/Open-AnimateAnyone
Unofficial Implementation of Animate Anyone
MooreThreads/Moore-AnimateAnyone
Character Animation (AnimateAnyone, Face Reenactment)
hustvl/Vim
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
IDEA-Research/T-Rex
API for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
yformer/EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
MzeroMiko/VMamba
VMamba: Visual State Space Models; code is based on Mamba
pytorch-labs/segment-anything-fast
A batched, offline-inference-oriented version of segment-anything
Flode-Labs/vid2densepose
Convert your videos to DensePose and use them with MagicAnimate
FoundationVision/GLEE
[CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
yyyujintang/Awesome-Mamba-Papers
Awesome Papers related to Mamba.
lxtGH/OMG-Seg
[CVPR-2024] One Model For Image/Video/Interactive/Open-Vocabulary Segmentation
CircleRadon/Osprey
[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
apple/ml-aim
This repository provides the code and model checkpoints of the research paper: Scalable Pre-training of Large Autoregressive Image Models
alibaba/animate-anything
Fine-Grained Open Domain Image Animation with Motion Guidance
LAION-AI/CLIP_benchmark
CLIP-like model evaluation
baaivision/tokenize-anything
Tokenize Anything via Prompting
pixeli99/SVD_Xtend
Stable Video Diffusion Training Code and Extensions.
CiaraStrawberry/svd-temporal-controlnet
OpenGVLab/unmasked_teacher
[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
xushilin1/RAP-SAM
Atomic-man007/Awesome_Multimodel_LLM
Awesome_Multimodel is a curated GitHub repository providing a comprehensive collection of resources for Multimodal Large Language Models (MLLMs): datasets, tuning techniques, in-context learning, visual reasoning, foundation models, and more. Stay updated with the latest advancements.
yuweihao/MM-Vet
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)
diffusion-motion-transfer/diffusion-motion-transfer
Official PyTorch Implementation of "Space-Time Diffusion Features for Zero-Shot Text-Driven Motion Transfer"
opendatalab/laion5b-downloader
xzz2/pa-sam
PA-SAM: Prompt Adapter SAM for High-quality Image Segmentation
wusize/CLIM
[AAAI2024] Code Release of CLIM: Contrastive Language-Image Mosaic for Region Representation