mllm-reasoning
There are 11 repositories under the mllm-reasoning topic.
yaotingwangofficial/Awesome-MCoT
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
ritzz-ai/GUI-R1
Official implementation of GUI-R1: A Generalist R1-Style Vision-Language Action Model for GUI Agents
Wild-Cooperation-Hub/Awesome-MLLM-Reasoning-Benchmarks
A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.
manglu097/Chiron-o1
[NeurIPS 2025] Chiron-o1: Igniting Multimodal Large Language Models towards Generalizable Medical Reasoning via Mentor-Intern Collaborative Search
luo-junyu/FinMME
[ACL 2025] FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation
YutingLi0606/Vision-Matters
[arXiv 2025] Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning
Kun-Xiang/AtomThink
Official repository of "AtomThink: Multimodal Slow Thinking with Atomic Step Reasoning"
falonss703/Awesome-Uncertainty-based-Reinforcement-Learning
🔥🔥🔥 Latest papers and code on uncertainty-based RL
Jorffy/NoteMR
[CVPR 2025] Code for "Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering".
SkyworkAI/CSVQA
A Multimodal Benchmark for Evaluating Scientific Reasoning Capabilities of VLMs
vulab-AI/YESBUT-v2
We introduce YesBut-v2, a benchmark for assessing AI's ability to interpret juxtaposed comic panels with contradictory narratives. Unlike existing benchmarks, it emphasizes visual understanding, comparative reasoning, and social knowledge.