zexiJia's Stars
BradyFU/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Advances on Multimodal Large Language Models
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
albarji/mixture-of-diffusers
Mixture of Diffusers for scene composition and high resolution image generation
YangLing0818/RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
yardenfren1996/B-LoRA
Implicit Style-Content Separation using B-LoRA
TyroneZ3/ARPGrounding
SivanDoveh/DAC
Repository for the paper "Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models"
om-ai-lab/VL-CheckList
Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]
facebookresearch/DCI
Densely Captioned Images (DCI) dataset repository.
RoyiRa/Linguistic-Binding-in-Diffusion-Models
mertyg/vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR 2023
haoningwu3639/SimpleSDXL
A simple and flexible PyTorch implementation of StableDiffusion-XL based on diffusers.
meta-llama/llama3
The official Meta Llama 3 GitHub site
yatengLG/ISAT_with_segment_anything
Labeling tool with SAM (Segment Anything Model); supports SAM, SAM2, sam-hq, MobileSAM, EdgeSAM, etc. An interactive, semi-automatic image annotation tool.
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
kvpratama/gan
Various GAN models
HaohanWang/ImageNet-Sketch
ImageNet-Sketch dataset for evaluating a model's ability to learn (out-of-domain) semantics at ImageNet scale
pschaldenbrand/StyleCLIPDraw
Styled text-to-drawing synthesis method. Featured at IJCAI 2022 and the 2021 NeurIPS Workshop on Machine Learning for Creativity and Design
yael-vinker/CLIPasso
uzh-rpg/RVT
Implementation of "Recurrent Vision Transformers for Object Detection with Event Cameras". CVPR 2023
richzhang/PerceptualSimilarity
LPIPS metric. pip install lpips
DWCTOD/CVPR2024-Papers-with-Code-Demo
Collects the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos. Recommendations are welcome!
google-research/maskgit
Official JAX Implementation of MaskGIT
82magnolia/n_imagenet
Official PyTorch implementation of N-ImageNet: Towards Robust, Fine-Grained Object Recognition with Event Cameras (ICCV 2021)
zcemycl/Pytorch_Outpainting_SRN
A casual PyTorch implementation of the Wide-Context Semantic Image Extrapolation paper
hila-chefer/Transformer-Explainability
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.
irfanICMLL/TorchDistiller
irfanICMLL/ETC-Real-time-Per-frame-Semantic-video-segmentation
Enforcing temporal consistency in real-time per-frame semantic video segmentation
uzh-rpg/E-RAFT
uzh-rpg/event-based_vision_resources