Wykay's Stars
google-research/tuning_playbook
A playbook for systematically maximizing the performance of deep learning models.
Stability-AI/generative-models
Generative Models by Stability AI
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
microsoft/Swin-Transformer
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
triton-lang/triton
Development repository for the Triton language and compiler
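A minimal sketch of what a Triton kernel looks like, following the standard vector-add pattern from the public `triton` / `triton.language` API; the kernel name and block size here are arbitrary choices for illustration.

```python
# Minimal Triton kernel sketch: element-wise vector add.
# Requires a CUDA GPU; BLOCK_SIZE and the kernel name are illustrative.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                       # which block this program handles
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                       # guard against out-of-range lanes
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(4096, device="cuda")
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```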
BradyFU/Awesome-Multimodal-Large-Language-Models
Latest Advances on Multimodal Large Language Models
microsoft/LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
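For context, a minimal sketch of the low-rank adaptation idea the repo implements: freeze the pretrained weight and learn a small rank-`r` update. Class and parameter names below are illustrative, not loralib's actual API.

```python
# Sketch of LoRA: y = x W^T + (alpha / r) * x A^T B^T, with W frozen.
# Names (LoRALinear, r, alpha) are illustrative, not loralib's API.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)          # frozen pretrained weight
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))  # zero init: no change at start
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(768, 768)
y = layer(torch.randn(2, 768))                          # only lora_A / lora_B receive gradients
```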
apple/ml-ferret
IDEA-Research/GroundingDINO
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
IDEA-Research/DINO
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
facebookresearch/Detic
Code release for "Detecting Twenty-thousand Classes using Image-level Supervision".
ytongbai/LVM
Hedlen/awesome-segment-anything
Tracking and collecting papers, projects, and other resources related to Segment Anything.
IDEA-Research/awesome-detection-transformer
A collection of papers on transformers for detection and segmentation. Awesome Detection Transformer for Computer Vision (CV).
facebookresearch/MetaCLIP
ICLR 2024 Spotlight: curation/training code, metadata, distribution, and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering
sfzhang15/ATSS
Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection (CVPR 2020, Oral)
luo3300612/Visualizer
Helper tools for visualizing attention in deep learning models
airsplay/lxmert
PyTorch code for EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers".
henghuiding/ReLA
[CVPR 2023 Highlight] GRES: Generalized Referring Expression Segmentation
OpenGVLab/all-seeing
[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of the Open World
amazon-science/bigdetection
BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training
lucidrains/memory-efficient-attention-pytorch
Implementation of memory-efficient multi-head attention as proposed in the paper "Self-attention Does Not Need O(n²) Memory"
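A rough sketch of the underlying idea: process queries in chunks so the full n × n score matrix is never materialized at once. This is a simplification of the paper's streaming softmax (which also chunks over keys with running normalizers) and is not the repo's implementation; all names are illustrative.

```python
# Query-chunked attention: peak memory O(chunk_size * n) instead of O(n^2).
# Simplified illustration only; not the lucidrains implementation.
import torch

def chunked_attention(q, k, v, chunk_size=128):
    # q, k, v: (batch, seq_len, dim)
    scale = q.shape[-1] ** -0.5
    outputs = []
    for start in range(0, q.shape[1], chunk_size):
        q_blk = q[:, start:start + chunk_size]                       # (b, c, d)
        scores = torch.einsum("bqd,bkd->bqk", q_blk, k) * scale      # (b, c, n)
        attn = scores.softmax(dim=-1)
        outputs.append(torch.einsum("bqk,bkd->bqd", attn, v))        # (b, c, d)
    return torch.cat(outputs, dim=1)

q = k = v = torch.randn(1, 1024, 64)
out = chunked_attention(q, k, v)

# Matches dense attention up to floating-point error.
ref = (torch.einsum("bqd,bkd->bqk", q, k) * 64 ** -0.5).softmax(-1) @ v
assert torch.allclose(out, ref, atol=1e-4)
```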
jozhang97/DETA
Detection Transformers with Assignment
alirezazareian/ovr-cnn
A new framework for open-vocabulary object detection, based on maskrcnn-benchmark
xk-huang/segment-caption-anything
[CVPR 2024] Code for inference and training of "Segment and Caption Anything" (SCA), links to the trained model checkpoints, and example notebooks / a Gradio demo showing how to use the model.
198808xc/Vision-AGI-Survey
A temporary webpage for our survey on AGI for computer vision
jianzongwu/betrayed-by-captions
(ICCV 2023) Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation
yangbang18/MultiCapCLIP
(ACL 2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
TencentARC/FLM
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
xk-huang/Promptable-GRiT
Promptable GRiT: supports inference with both automatic proposal generation and custom point/box prompts.