✨✨Latest Papers and Benchmarks in Reasoning with Foundation Models

License: MIT

Awesome-Reasoning-Foundation-Models

survey.pdf | A curated list of awesome large AI models, or foundation models, for reasoning.

We organize current foundation models into three categories: language foundation models, vision foundation models, and multimodal foundation models. We then review how foundation models are applied to reasoning tasks, including commonsense, mathematical, logical, causal, visual, audio, multimodal, and agent reasoning. We also summarize reasoning techniques, including pre-training, fine-tuning, alignment training, mixture of experts, in-context learning, and autonomous agents.

We welcome contributions to this repository to add more resources. Please submit a pull request if you want to contribute! See CONTRIBUTING.

0 Survey

This repository is primarily based on the following paper:

A Survey of Reasoning with Foundation Models

Jiankai Sun, Chuanyang Zheng, Enze Xie, Zhengying Liu, Ruihang Chu, Jianing Qiu, Jiaqi Xu, Mingyu Ding, Hongyang Li, Mengzhe Geng, Yue Wu, Wenhai Wang, Junsong Chen, Zhangyue Yin, Xiaozhe Ren, Jie Fu, Junxian He, Wu Yuan, Qi Liu, Xihui Liu, Yu Li, Hao Dong, Yu Cheng, Ming Zhang, Pheng Ann Heng, Jifeng Dai, Ping Luo, Jingdong Wang, Ji-Rong Wen, Xipeng Qiu, Yike Guo, Hui Xiong, Qun Liu, and Zhenguo Li

If you find this repository helpful, please consider citing:

@article{sun2023survey,
  title={A Survey of Reasoning with Foundation Models},
  author={Sun, Jiankai and Zheng, Chuanyang and Xie, Enze and Liu, Zhengying and Chu, Ruihang and Qiu, Jianing and Xu, Jiaqi and Ding, Mingyu and Li, Hongyang and Geng, Mengzhe and others},
  journal={arXiv preprint arXiv:2312.11562},
  year={2023}
}

1 Relevant Surveys and Links

  • The Rise and Potential of Large Language Model Based Agents: A Survey - [arXiv] [Link]

  • Multimodal Foundation Models: From Specialists to General-Purpose Assistants - [arXiv] [Tutorial]

  • A Survey on Multimodal Large Language Models - [arXiv] [Link]

  • Interactive Natural Language Processing - [arXiv] [Link]

  • A Survey of Large Language Models - [arXiv] [Link]

  • Self-Supervised Multimodal Learning: A Survey - [arXiv] [Link]

  • Large AI Models in Health Informatics: Applications, Challenges, and the Future - [arXiv] [Paper] [Link]

  • Towards Reasoning in Large Language Models: A Survey - [arXiv] [Paper] [Link]

  • Reasoning with Language Model Prompting: A Survey - [arXiv] [Paper] [Link]

  • Awesome Multimodal Reasoning - [Link]

2 Foundation Models

2.1 Language Foundation Models

2.2 Vision Foundation Models

2.3 Multimodal Foundation Models

2.4 Reasoning Applications

3 Reasoning Tasks

3.1 Commonsense Reasoning

3.1.1 Commonsense Question Answering (QA)

3.1.2 Physical Commonsense Reasoning

3.1.3 Spatial Commonsense Reasoning

3.1.x Benchmarks, Datasets, and Metrics


3.2 Mathematical Reasoning

3.2.1 Arithmetic Reasoning

3.2.2 Geometry Reasoning

3.2.3 Theorem Proving

3.2.4 Scientific Reasoning

3.2.x Benchmarks, Datasets, and Metrics


3.3 Logical Reasoning

3.3.1 Propositional Logic

  • 2022/09 | Propositional Reasoning via Neural Transformer Language Models - [Paper]

3.3.2 Predicate Logic

3.3.x Benchmarks, Datasets, and Metrics


3.4 Causal Reasoning

3.4.1 Counterfactual Reasoning

3.4.x Benchmarks, Datasets, and Metrics


3.5 Visual Reasoning

3.5.1 3D Reasoning

3.5.x Benchmarks, Datasets, and Metrics


3.6 Audio Reasoning

3.6.1 Speech

3.6.x Benchmarks, Datasets, and Metrics


3.7 Multimodal Reasoning

3.7.1 Alignment

3.7.2 Generation

3.7.3 Multimodal Understanding

3.7.x Benchmarks, Datasets, and Metrics


3.8 Agent Reasoning

3.8.1 Introspective Reasoning

3.8.2 Extrospective Reasoning

3.8.3 Multi-agent Reasoning

3.8.4 Driving Reasoning

3.8.x Benchmarks, Datasets, and Metrics


3.9 Other Tasks and Applications

3.9.1 Theory of Mind (ToM)

3.9.2 LLMs for Weather Prediction

  • 2022/09 | MetNet-2 | Deep learning for twelve hour precipitation forecasts - [Paper]

  • 2023/07 | Pangu-Weather | Accurate medium-range global weather forecasting with 3D neural networks - [Paper]

3.9.3 Abstract Reasoning

3.9.4 Defeasible Reasoning

3.9.5 Medical Reasoning

3.9.6 Bioinformatics Reasoning

3.9.7 Long-Chain Reasoning


4 Reasoning Techniques

4.1 Pre-Training

4.1.1 Data

a. Data - Text
b. Data - Image
c. Data - Multimodality

4.1.2 Network Architecture

a. Encoder-Decoder
b. Decoder-Only
c. CLIP Variants
d. Others

4.2 Fine-Tuning

4.2.1 Data

4.2.2 Parameter-Efficient Fine-Tuning

a. Adapter Tuning
b. Low-Rank Adaptation
c. Prompt Tuning
d. Partial Parameter Tuning
e. Mixture-of-Modality Adaptation

4.3 Alignment Training

4.3.1 Data

a. Data - Human
b. Data - Synthesis

4.3.2 Training Pipeline

a. Online Human Preference Training
b. Offline Human Preference Training

4.4 Mixture of Experts (MoE)

4.5 In-Context Learning

4.5.1 Demonstration Example Selection

a. Prior-Knowledge Approach
b. Retrieval Approach

4.5.2 Chain-of-Thought

a. Zero-Shot CoT
b. Few-Shot CoT
c. Multiple Paths Aggregation

4.5.3 Multi-Round Prompting

a. Learned Refiners
b. Prompted Refiners

4.6 Autonomous Agent
