xiaoqian-shen's Stars
timothybrooks/instruct-pix2pix
showlab/Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
Doubiiu/DynamiCrafter
[ECCV 2024] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
dvlab-research/LISA
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
baaivision/Emu
Emu Series: Generative Multimodal Models from BAAI
omerbt/TokenFlow
Official PyTorch implementation of "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" (ICLR 2024)
Fantasy-Studio/Paint-by-Example
Paint by Example: Exemplar-based Image Editing with Diffusion Models
Vchitect/SEINE
[ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
Vchitect/LaVie
LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
mbzuai-oryx/groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
castorini/daam
Diffusion attentive attribution maps for interpreting Stable Diffusion.
ExponentialML/Text-To-Video-Finetuning
Finetune ModelScope's Text To Video model using Diffusers 🧨
allenai/unified-io-2
AILab-CVC/SEED
Official implementation of SEED-LLaMA (ICLR 2024).
YingqingHe/LVDM
LVDM: Latent Video Diffusion Models for High-Fidelity Long Video Generation
kohjingyu/gill
🐟 Code and models for the NeurIPS 2023 paper "Generating Images with Multimodal Language Models".
dvlab-research/Video-P2P
Video-P2P: Video Editing with Cross-attention Control
AILab-CVC/FreeNoise
[ICLR 2024] Code for FreeNoise based on VideoCrafter
AILab-CVC/TaleCrafter
[SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters
jamespark3922/visual-comet
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
genforce/StyleSV
[ICLR 2023] Towards Smooth Video Composition
bytedance/Shot2Story
Shot2Story: a new multi-shot video understanding benchmark with comprehensive video summaries and detailed shot-level captions.
google/storybench
ubc-vision/Make-A-Story
Code Release for the paper "Make-A-Story: Visual Memory Conditioned Consistent Story Generation" in CVPR 2023
xiaoqian-shen/StoryGPT-V
adymaharana/StoryViz
adymaharana/VLCStoryGan
Official code repository for the EMNLP 2021 paper
princetonvisualai/pointingqa
Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"
yonseivnl/cmota
ali-vilab/i2vgen-xl