visual-reasoning
There are 42 repositories under the visual-reasoning topic.
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
MILVLG/mcan-vqa
Deep Modular Co-Attention Networks for Visual Question Answering
ethanjperez/film
FiLM: Visual Reasoning with a General Conditioning Layer
floodsung/Deep-Reasoning-Papers
Recent Papers including Neural Symbolic Reasoning, Logical Reasoning, Visual Reasoning, planning and any other topics connecting deep learning and reasoning
CSfufu/Revisual-R1
🚀 ReVisual-R1 is a 7B open-source multimodal language model that follows a three-stage curriculum (cold-start pre-training, multimodal reinforcement learning, and text-only reinforcement learning) to achieve faithful, concise, and self-reflective state-of-the-art performance in visual and textual reasoning.
LAMDASZ-ML/Awesome-LLM-Reasoning-with-NeSy
✨✨ Latest Advances on Neuro-Symbolic Learning in the era of Large Language Models
WellyZhang/RAVEN
RAVEN: A Dataset for Relational and Analogical Visual rEasoNing
eric-ai-lab/GRIT
Official code for NeurIPS 2025 paper "GRIT: Teaching MLLMs to Think with Images"
keshik6/HourVideo
[NeurIPS 2024] Official code for HourVideo: 1-Hour Video Language Understanding
shijx12/XNM-Net
PyTorch implementation of "Explainable and Explicit Visual Reasoning over Scene Graphs"
NVlabs/Bongard-HOI
[CVPR 2022 (oral)] Bongard-HOI for benchmarking few-shot visual reasoning
MSR3D/MSR3D
[NeurIPS 2024] Official code repository for MSR3D paper
NVlabs/RelViT
[ICLR 2022] RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning
cobanov/image-captioning
Image captioning using Python and BLIP
hughplay/TVR
💥 Transformation Driven Visual Reasoning - CVPR 2021
yangjie-cv/WeThink
WeThink: Toward General-purpose Vision-Language Reasoning via Reinforcement Learning
bezorro/ACMN-Pytorch
Visual Question Reasoning on General Dependency Tree
hughplay/Visual-Reasoning-Papers
📄 A curated list of visual reasoning papers.
WellyZhang/CoPINet
Learning Perceptual Inference by Contrasting
WellyZhang/PrAE
Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution
catalina17/VideoNavQA
An alternative EQA paradigm and informative benchmark + models (BMVC 2019, ViGIL 2019 spotlight)
jaleedkhan/neusire
NeuSyRE: A Neuro-Symbolic Visual Understanding and Reasoning Framework based on Scene Graph Enrichment
WellyZhang/ACRE
ACRE: Abstract Causal REasoning Beyond Covariation
Liyan06/ChartMuseum
[NeurIPS 2025] ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models
WellyZhang/ALANS
Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning
aelnouby/Relational-Networks
PyTorch implementation of the paper "A simple neural network module for relational reasoning", aka Relational Networks for visual reasoning.
andrewliao11/LongPerceptualThoughts
[COLM'25] The official implementation of "LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception"
marialymperaiou/knowledge-enhanced-multimodal-learning
A list of research papers on knowledge-enhanced multimodal learning
Sina-Baharlou/VisualGenome-to-Depth
Convert RGB images of Visual-Genome dataset to Depth Maps.
wentaoheunnc/HCV-ARR
[AAAI 2023] Hierarchical ConViT with Attention-based Relational Reasoner for Visual Analogical Reasoning
raminguyen/LLMP2
Evaluating ‘Graphical Perception’ with Multimodal Large Language Models
alexmirrington/honours-thesis
LaTeX files for my honours thesis: "Graph Attention Networks for Compositional Visual Question Answering"
jaehyunnn/RelationalNetwork_pytorch
An unofficial implementation of Relational Network [A. Santoro et al., 2017] (PyTorch)
rs9000/VisualReasoning_MMnet
Visual reasoning modular memory network
markvasin/openvqa
Implementation of the VQA model from my MSc project