jc1888822's Stars
eric-ai-lab/Screen-Point-and-Read
Code repo for "Read Anywhere Pointed: Layout-aware GUI Screen Reading with Tree-of-Lens Grounding"
ZJULiHongxin/AutoGUI
The official implementation of AutoGUI.
showlab/Awesome-GUI-Agent
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
njucckevin/SeeClick
The model, data and code for the visual GUI Agent SeeClick
ultralytics/yolov5
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
kdwonn/SaG
Official repository of "Shatter and Gather: Learning Referring Image Segmentation with Text Supervision" (ICCV 2023)
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
amusi/AI-Job-Notes
A job-hunting guide for AI algorithm roles (covering preparation strategies, coding-interview guides, referrals, a list of AI companies, and more)
zjh31/CPL
MarkMoHR/Awesome-Referring-Image-Segmentation
📚 A collection of papers about Referring Image Segmentation.
clownrat6/Out-of-Candidate-Rectification
[CVPR 2023] Implementation of out-of-candidate rectification methods
muyangyi/SimSeg
[CVPR 2023] A Simple Framework for Text-Supervised Semantic Segmentation
Jazzcharles/OVSegmentor
[CVPR 2023] OVSegmentor
khanrc/tcl
Official implementation of TCL (CVPR 2023)
Vibashan/Mask-free-OVIS
[CVPR 2023] Official PyTorch codebase for "Open-Vocabulary Instance Segmentation without Manual Mask Annotations"
YuLiu-LY/BO-QSA
The official implementation of "Improving Object-centric Learning with Query Optimization"
linyq2117/CLIP-ES
SooLab/CGFormer
The official PyTorch implementation of the CVPR 2023 paper "Contrastive Grouping with Transformer for Referring Image Segmentation".
fawnliu/TRIS
[ICCV 2023] Official code release of our paper "Referring Image Segmentation Using Text Supervision"
rulixiang/ToCo
[CVPR 2023] Token Contrast for Weakly-Supervised Semantic Segmentation
linhuixiao/CLIP-VG
[TMM 2023] Self-paced Curriculum Adapting of CLIP for Visual Grounding.
codezakh/SIMLA
[ECCV 2022] Single Stream Multi-Level Alignment for Vision Language Pretraining
zjukg/DUET
[AAAI 2023] DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning
lezhang7/Enhance-FineGrained
[CVPR 2024] Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding
THU-MIG/Consolidator
Official implementation for ICLR 2023 paper Consolidator: Mergeable Adapter with Grouped Connections for Visual Adaptation
modestyachts/ImageNetV2_pytorch
ImageNetV2 Pytorch Dataset
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
X-PLUG/mPLUG-2
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
zhangxinsong-nlp/XFM
source code for XFM, a general foundation model for language, vision, and vision-language understanding