zexiJia

zexiJia's Stars

BradyFU/Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
13.1k stars, 835 forks
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
Language: Jupyter Notebook, 10.1k stars, 977 forks
albarji/mixture-of-diffusers
Mixture of Diffusers for scene composition and high resolution image generation
Language:Python41837
YangLing0818/RPG-DiffusionMaster
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
Language: Jupyter Notebook, 1.7k stars, 100 forks
yardenfren1996/B-LoRA
Implicit Style-Content Separation using B-LoRA
Language:Jupyter Notebook31422
TyroneZ3/ARPGrounding
Language: Python, 2 stars
SivanDoveh/DAC
Repository for the paper "Dense and Aligned Captions (DAC) Promote Compositional Reasoning in VL Models"
Language:Python252
om-ai-lab/VL-CheckList
Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]
Language:Python1284
facebookresearch/DCI
Densely Captioned Images (DCI) dataset repository.
Language:Python1625
RoyiRa/Linguistic-Binding-in-Diffusion-Models
Language:Jupyter Notebook7411
mertyg/vision-language-models-are-bows
Experiments and data for the paper "When and why vision-language models behave like bags-of-words, and what to do about it?" Oral @ ICLR 2023
Language:Python26315
haoningwu3639/SimpleSDXL
A simple and flexible PyTorch implementation of StableDiffusion-XL based on diffusers.
Language:Python132
meta-llama/llama3
The official Meta Llama 3 GitHub site
Language: Python, 27.5k stars, 3.1k forks
yatengLG/ISAT_with_segment_anything
Labeling tool with SAM (Segment Anything Model); supports SAM, SAM2, sam-hq, MobileSAM, EdgeSAM, etc. An interactive semi-automatic image annotation tool.
Language: Python, 1.3k stars, 143 forks
IDEA-Research/Grounded-Segment-Anything
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect , Segment and Generate Anything
Language: Jupyter Notebook, 15.4k stars, 1.4k forks
kvpratama/gan
Various GAN models
Language:Python6234
HaohanWang/ImageNet-Sketch
ImageNet-Sketch data set for evaluating a model's ability to learn (out-of-domain) semantics at ImageNet scale
Language:Python20716
pschaldenbrand/StyleCLIPDraw
Styled text-to-drawing synthesis method. Featured at IJCAI 2022 and the 2021 NeurIPS Workshop on Machine Learning for Creativity and Design
Language:Jupyter Notebook27916
yael-vinker/CLIPasso
Language:Jupyter Notebook85893
uzh-rpg/RVT
Implementation of "Recurrent Vision Transformers for Object Detection with Event Cameras". CVPR 2023
Language:Python32742
richzhang/PerceptualSimilarity
LPIPS metric. pip install lpips
Language: Python, 3.7k stars, 502 forks
DWCTOD/CVPR2024-Papers-with-Code-Demo
A collection of the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos. Recommendations are welcome!
1.3k stars, 146 forks
google-research/maskgit
Official Jax Implementation of MaskGIT
Language:Jupyter Notebook45950
82magnolia/n_imagenet
Official PyTorch implementation of N-ImageNet: Towards Robust, Fine-Grained Object Recognition with Event Cameras (ICCV 2021)
Language:Python535
zcemycl/Pytorch_Outpainting_SRN
A casual PyTorch implementation of Wide-Context Semantic Image Extrapolation paper
Language:Python143
hila-chefer/Transformer-Explainability
[CVPR 2021] Official PyTorch implementation for Transformer Interpretability Beyond Attention Visualization, a novel method to visualize classifications by Transformer based networks.
Language: Jupyter Notebook, 1.8k stars, 241 forks
irfanICMLL/TorchDistiller
Language:Python19324
irfanICMLL/ETC-Real-time-Per-frame-Semantic-video-segmentation
Enforcing temporal consistency in real-time per-frame semantic video segmentation
Language:Python30230
uzh-rpg/E-RAFT
Language:Python11619
uzh-rpg/event-based_vision_resources
2.9k stars, 662 forks