Om AI Research Lab's Repositories
om-ai-lab/OmDet
Real-time and accurate open-vocabulary end-to-end object detection
om-ai-lab/OmAgent
A multimodal agent framework for solving complex tasks [EMNLP 2024]
om-ai-lab/RS5M
RS5M: A large-scale vision-language dataset for remote sensing [TGRS]
om-ai-lab/VL-CheckList
Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]
om-ai-lab/awesome-RSVLM
Collection of Remote Sensing Vision-Language Models
om-ai-lab/GroundVLP
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection (AAAI 2024)
om-ai-lab/OmModel
A collection of strong multimodal models for building multimodal AGI agents
om-ai-lab/OVDEval
A Comprehensive Evaluation Benchmark for Open-Vocabulary Detection (AAAI 2024)
om-ai-lab/OmChat
A suite of multimodal language models that are powerful and efficient
om-ai-lab/habitat-lab
A modular high-level library to train embodied AI agents across a variety of tasks, environments, and simulators.
om-ai-lab/bottom-up-attention.pytorch
A PyTorch reimplementation of bottom-up-attention models