jianlong-yuan
Interested in Dense Prediction, such as Depth Estimation and Semantic Segmentation
Alibaba-DAMO, Beijing
jianlong-yuan's Stars
Stability-AI/generative-models
Generative Models by Stability AI
mlfoundations/open_clip
An open source implementation of CLIP.
guoyww/AnimateDiff
Official implementation of AnimateDiff.
RayVentura/ShortGPT
🚀🎬 ShortGPT - Experimental AI framework for youtube shorts / tiktok channel automation
PixArt-alpha/PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
DAMO-NLP-SG/Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
omerbt/TokenFlow
Official Pytorch Implementation for "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" presenting "TokenFlow" (ICLR 2024)
DjangoPeng/openai-quickstart
A comprehensive guide to understanding and implementing large language models with hands-on examples using LangChain for GenAI applications.
facebookresearch/MetaCLIP
ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering
xiaobai1217/Awesome-Video-Datasets
Video datasets
showlab/Show-1
[IJCV] Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
MC-E/DragonDiffusion
ICLR 2024 (Spotlight)
rese1f/MovieChat
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
segmind/distill-sd
Segmind Distilled diffusion
iejMac/video2dataset
Easily create large video datasets from video URLs
RaymondWang987/NVDS
ICCV 2023 "Neural Video Depth Stabilizer" (NVDS) & TPAMI 2024 "NVDS+: Towards Efficient and Versatile Neural Stabilizer for Video Depth Estimation" (NVDS+)
ziqihuangg/ReVersion
[SIGGRAPH Asia 2024] ReVersion: Diffusion-Based Relation Inversion from Images
microsoft/XPretrain
Multi-modality pre-training
forence/Awesome-Visual-Captioning
This repository focuses on Image Captioning, Video Captioning, Seq-to-Seq Learning, and NLP
VQAssessment/DOVER
[ICCV 2023, Official Code] for paper "Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives". Official Weights and Demos provided.
cure-lab/PnPInversion
[ICLR2024] Official repo for paper "PnP Inversion: Boosting Diffusion-based Editing with 3 Lines of Code"
OPPO-Mente-Lab/Subject-Diffusion
Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning
showlab/all-in-one
[CVPR2023] All in One: Exploring Unified Video-Language Pre-training
showlab/EgoVLP
[NeurIPS 2022] Egocentric Video-Language Pretraining
facebookresearch/ActivityNet-Entities
A Dataset for Grounded Video Description
jiaxilv/GPT4Motion
tgc1997/Awesome-Video-Captioning
A curated list of research papers in Video Captioning
simon3dv/SLR-SFS
Code release for the paper "Simulating Fluids in Real-World Still Images"
liveseongho/Awesome-Video-Language-Understanding
A Survey on video and language understanding.
kyegomez/Gen1
My Implementation of "Structure and Content-Guided Video Synthesis with Diffusion Models" by RunwayML