alice-cool's Stars
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
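For quick reference, a minimal sketch of SAM's point-prompted inference API (the checkpoint filename, image path, and click coordinates below are placeholder assumptions; the repo's notebooks show the canonical usage):

```python
# Minimal sketch: segment an object from a single foreground click,
# assuming the ViT-H checkpoint has been downloaded from the repo's model zoo.
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# SAM expects an HxWx3 uint8 RGB array.
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# One foreground point prompt (x, y); label 1 marks foreground.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,  # return 3 candidate masks with predicted IoU scores
)
```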
jianzongwu/Awesome-Open-Vocabulary
(TPAMI 2024) A Survey on Open Vocabulary Learning
liliu-avril/Awesome-Segment-Anything
This repository is for the first comprehensive survey on Meta AI's Segment Anything Model (SAM).
TencentARC/SEED-Story
SEED-Story: Multimodal Long Story Generation with Large Language Model
AIDC-AI/Ovis
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
sstary/SSRS
zhengli97/Awesome-Prompt-Adapter-Learning-for-VLMs
A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.
imagegridworth/IG-VLM
zhiweihu1103/AgriMa
Houji (后稷): the first open-source Chinese agricultural large language model
Hzzone/PseCo
(CVPR 2024) Point, Segment and Count: A Generalized Framework for Object Counting
Junjue-Wang/EarthVQA
[AAAI 2024] EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering
WHB139426/Grounded-Video-LLM
Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
amazon-science/QA-ViT
BioMedIA-MBZUAI/MedPromptX
StephenApX/UCD-SCM
[IGARSS 2024] Segment Change Model (SCM) for Unsupervised Change Detection in VHR Remote Sensing Images: A Case Study of Buildings
jinlHe/PeFoMed
Code for the paper: PeFoM-Med: Parameter Efficient Fine-tuning on Multi-modal Large Language Models for Medical Visual Question Answering
JiajiaLi04/Agriculture-Foundation-Models
Foundation models & LLMs
wchh-2000/SAMPolyBuild
Adapting the Segment Anything Model for Polygonal Building Extraction
rabiulcste/vqazero
Visual question answering prompting recipes for large vision-language models
yzygit1230/SCD-SAM
Lans1ng/PointSAM
[TGRS2025] Code for "PointSAM: Pointly-Supervised Segment Anything Model for Remote Sensing Images"
matthewdm0816/BridgeQA
[AAAI 24] Official Codebase for BridgeQA: Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA
GaryGuTC/LaPA_model
[CVPRW 2024] LaPA: Latent Prompt Assist Model for Medical Visual Question Answering
codezakh/SelTDA
[CVPR 23] Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!
StriveZs/ALPS
ALPS: An Auto-Labeling and Pre-training Scheme for Remote Sensing Segmentation With Segment Anything Model
bowen-upenn/Multi-Agent-VQA
[CVPR 2024 CVinW] Multi-Agent VQA: Exploring Multi-Agent Foundation Models on Zero-Shot Visual Question Answering
ControlNet/HYDRA
[ECCV] HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning
Lackel/DKA
[arXiv 2024] Knowledge Acquisition Disentanglement for Knowledge-based Visual Question Answering with Large Language Models
thomaswei-cn/MC-CoT
MC-CoT implementation code
WHB139426/QA-Prompts
Q&A Prompts: Discovering Rich Visual Clues through Mining Question-Answer Prompts for VQA requiring Diverse World Knowledge [ECCV'24]