Wuziyi616's Stars
TonyLianLong/LLM-groundedVideoDiffusion
[ICLR 2024] LLM-grounded Video Diffusion Models (LVD): official implementation for the LVD paper
jiaxilv/GPT4Motion
Vchitect/VBench
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
Vchitect/SEINE
[ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
songweige/TATS
Official PyTorch implementation of TATS: A Long Video Generation Framework with Time-Agnostic VQGAN and Time-Sensitive Transformer (ECCV 2022)
songweige/content-debiased-fvd
[CVPR 2024] On the Content Bias in Fréchet Video Distance
google/storybench
Breakthrough/PySceneDetect
🎥 Python and OpenCV-based scene cut/transition detection program & library.
OpenBMB/MiniCPM-V
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
a1600012888/PhysDreamer
Code for PhysDreamer
uzh-rpg/dagr
Code for the paper "Low Latency Automotive Vision with Event Cameras", published in Nature
lllyasviel/Omost
Your image is almost there!
bcmi/Object-Shadow-Generation-Dataset-DESOBAv2
[CVPR 2024] The dataset, code, and model for our paper "Shadow Generation for Composite Image Using Diffusion Model".
NVlabs/DoRA
[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation
THUDM/CogVLM2
GPT4V-level open-source multi-modal model based on Llama3-8B
guanyingc/PS-FCN_Poster_LaTex
LaTeX Poster for PS-FCN (ECCV 2018)
toshas/torch-fidelity
High-fidelity performance metrics for generative models in PyTorch
NVlabs/edm2
Analyzing and Improving the Training Dynamics of Diffusion Models (EDM2)
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal chat model approaching GPT-4V performance
ziqihuangg/Awesome-Evaluation-of-Visual-Generation
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems
CLAY-3D/OpenCLAY
CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets
justimyhxu/GRM
Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation
mbanani/probe3d
[CVPR 2024] Probing the 3D Awareness of Visual Foundation Models
google/flaxformer
meta-llama/llama3
The official Meta Llama 3 GitHub site
FoundationVision/VAR
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
SunzeY/AlphaCLIP
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
jasonyzhang/RayDiffusion
Code for "Cameras as Rays"
prs-eth/Marigold
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation