Here you will find a collection of material (books, papers, blog posts, etc.) related to reasoning and cognition in AI systems. Specifically, we want to cover agents, cognitive architectures, general problem-solving strategies, and self-improvement.
The term "System 2" in the page title refers to the slower, more deliberative, and more logical mode of thought as described by Daniel Kahneman in his book Thinking, Fast and Slow.
You know a great resource we should add? Please see How to contribute.
(looking for additional links & articles and summaries)
- SOAR (State, Operator, And Result) by John Laird, Allen Newell, and Paul Rosenbloom
- ACT-R (Adaptive Control of Thought-Rational) by John Anderson at CMU
- SPAUN (Semantic Pointer Architecture Unified Network) by Chris Eliasmith at Waterloo, SPAUN 2.0 by Feng-Xuan Choo
- ART (Adaptive resonance theory) by Stephen Grossberg and Gail Carpenter
- CLARION (Connectionist Learning with Adaptive Rule Induction ON-line) by Ron Sun
- EPIC (Executive Process/Interactive Control) by David Kieras and David Meyer
- LIDA (Learning Intelligent Distribution Agent) by Stan Franklin
- Sigma by Paul Rosenbloom
- OpenCog by Ben Goertzel
- NARS (Non-Axiomatic Reasoning System) by Pei Wang
- Icarus by Pat Langley
- MicroPsi by Joscha Bach
- Thousand Brains Theory & HTM (Hierarchical Temporal Memory) by Jeff Hawkins
- SPH (Sparse Predictive Hierarchies) by Eric Laukien
- Leabra (Local, Error-driven and Associative, Biologically Realistic Algorithm), 2016 Paper by Randall O'Reilly
- CogNGen (COGnitive Neural GENerative system) by Alexander Ororbia and Mary Alexandria Kelly, see also here and here
- KIX (KIX: A Metacognitive Generalization Framework) by A. Kumar and Paul Schrater
- The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery, gh: SakanaAI/AI-Scientist
- OpenDevin: An Open Platform for AI Software Developers as Generalist Agents
- SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
- TextGrad: Automatic "Differentiation" via Text
- ReAct: Synergizing Reasoning and Acting in Language Models
- Agentless: Demystifying LLM-based Software Engineering Agents
- Competition-Level Code Generation with AlphaCode
- AI Agents That Matter
- Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning
- Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents
- Self-Rewarding Language Models
- ArchCode: Incorporating Software Requirements in Code Generation with Large Language Models
- MedAgent-Zero: Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents
- Self-Discover: Large Language Models Self-Compose Reasoning Structures
- Cognitive Architectures for Language Agents
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
- Large Language Models Can Self-Improve At Web Agent Tasks
- AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation
- A Prefrontal Cortex-inspired Architecture for Planning in Large Language Models
- CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization
- DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
- Survey: Reasoning with Large Language Models, a Survey (Jul 2024)
- Survey: From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future (Aug 2024)
- Voyager: An Open-Ended Embodied Agent with Large Language Models
- JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models
- Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks
- STEVE Series: Step-by-Step Construction of Agent Systems in Minecraft
- Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
- Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
- AlphaCode 2 Technical Report
- A Path Towards Autonomous Machine Intelligence
- GAIA-1: A Generative World Model for Autonomous Driving
- Latent space world-models: Dreamer, V2, V3, DayDreamer
- World Models, web: project page
- HYSYNTH: Context-Free LLM Approximation for Guiding Program Synthesis
- SymbolicAI: A framework for logic-based approaches combining generative models and solvers
- DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning
- A Neuro-vector-symbolic Architecture for Solving Raven's Progressive Matrices
- Surveys:
- (Feb 2024) A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications
- Prompt Engineering Guide Prompting Techniques
- Prompting Fundamentals and How to Apply them Effectively by Eugene Yan
- Chain-of-Thought (CoT): Paper
- Tree-of-Thoughts (ToT): Paper
- Graph-of-Thoughts (GoT): Paper, code
- Algorithm of Thoughts (AoT): Paper
- Chain-of-Verification (CoVe/CoV): Paper
- Mixture-of-Agents (MoA): Paper
- Tool-Integrated Reasoning (ToRA / TIR): Paper
- Program of Thoughts (PoT): Paper
- Buffer of Thoughts (BoT): Paper
- Chain of Code (CoC): Paper
- DeepMind AlphaProof and AlphaGeometry 2
- Getting 50% (SoTA) on ARC-AGI with GPT-4o
- Schmidhuber: Artificial Curiosity & Creativity
- synthesis.ai: Do Androids Dream? World Models in Modern AI
- Our Transformers Code Agent beats the GAIA benchmark!
- Lil'Log: LLM Powered Autonomous Agents (Jun 2023), https://lilianweng.github.io/posts/2023-06-23-agent/
- Distill A Gentle Introduction to Graph Neural Networks (2021)
- Geometric Deep Learning - Grids, Groups, Graphs, Geodesics, and Gauges
Answering logical queries over incomplete knowledge graphs. Ideally, this requires combining sparse symbolic index collation (SQL, SPARQL, etc.) with dense vector search, preferably in a differentiable manner.
- Neural Graph Reasoning: Complex Logical Query Answering Meets Graph Databases
- Adapting Neural Link Predictors for Data-Efficient Complex Query Answering
- Generalizing Knowledge Graph Embedding with Universal Orthogonal Parameterization
- Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding
- Wasserstein-Fisher-Rao Embedding: Logical Query Embeddings with Local Comparison and Global Transport
- GammaE: Gamma Embeddings for Logical Queries on Knowledge Graphs
- Soft Reasoning on Uncertain Knowledge Graphs
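
To make the idea of answering a logical query over an incomplete graph more concrete, here is a minimal sketch of fuzzy-logic query evaluation on top of link-prediction scores. It is not the method of any specific paper listed above. The entity names, the `LINK_SCORES` table (a hand-written stand-in for a trained neural link predictor), and the helper names `project` and `intersect` are all hypothetical. Relation projection uses max-product composition and conjunction uses the product t-norm; other t-norms would work as well.

```python
from typing import Dict, List, Tuple

# Hypothetical plausibility scores in [0, 1] for (head, relation, tail) triples.
# In a real system these would come from a trained link predictor; a score
# below 1.0 stands for an edge that is missing from the graph but predicted.
LINK_SCORES: Dict[Tuple[str, str, str], float] = {
    ("alice", "friend_of", "bob"): 1.0,
    ("alice", "friend_of", "carol"): 0.6,
    ("bob", "friend_of", "carol"): 0.8,
    ("bob", "works_at", "acme"): 1.0,
    ("carol", "works_at", "globex"): 0.9,
}
ENTITIES: List[str] = ["alice", "bob", "carol", "acme", "globex"]


def project(fuzzy_set: Dict[str, float], relation: str) -> Dict[str, float]:
    """Follow `relation` from a fuzzy set of entities (max-product composition)."""
    return {
        tail: max(
            (
                membership * LINK_SCORES.get((head, relation, tail), 0.0)
                for head, membership in fuzzy_set.items()
            ),
            default=0.0,
        )
        for tail in ENTITIES
    }


def intersect(a: Dict[str, float], b: Dict[str, float]) -> Dict[str, float]:
    """Fuzzy conjunction of two answer sets using the product t-norm."""
    return {e: a.get(e, 0.0) * b.get(e, 0.0) for e in ENTITIES}


def workplaces_of_friends(person: str) -> Dict[str, float]:
    """Two-hop query: where do the friends of `person` work?"""
    return project(project({person: 1.0}, "friend_of"), "works_at")


if __name__ == "__main__":
    both = intersect(workplaces_of_friends("alice"), workplaces_of_friends("bob"))
    for entity, score in sorted(both.items(), key=lambda kv: -kv[1]):
        print(f"{entity}: {score:.3f}")
```

Every candidate answer ends up with a soft score instead of a hard yes/no, which is what lets a predicted but unobserved edge still contribute to the result.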
Similar to regular CLQA, but with the emphasis on the "inductive setting", i.e. querying over nodes, edge types, or even entire graphs that were unseen during training. The last case is interesting because it relies on the higher-order "relations between relations" structure, connecting KG inference to category theory.
- Zero-shot Logical Query Reasoning on any Knowledge Graph
- Extending Transductive Knowledge Graph Embedding Models for Inductive Logical Relational Inference
- Neural-Symbolic Models for Logical Queries on Knowledge Graphs
- InGram: Inductive Knowledge Graph Embedding via Relation Graphs
Initially attempted back in 2014 with the general-purpose but unstable Neural Turing Machines. Modern Neural Algorithmic Reasoning (NAR) approaches limit their scope to building GNN-based "Algorithmic Processor Networks", which learn to mimic classical algorithms on synthetic data and can be deployed on noisy real-world problems by sandwiching frozen instances inside an Encoder-Processor-Decoder architecture.
- Neural Turing Machines, 2014
- A Generalist Neural Algorithmic Learner
- Transformers meet Neural Algorithmic Reasoners
- Recursive Algorithmic Reasoning
- Dual Algorithmic Reasoning
- Learning to Configure Computer Networks with Neural Algorithmic Reasoning
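
The Encoder-Processor-Decoder layout described above can be illustrated with a toy in which nothing is learned. In this sketch the "processor" is a hand-written Bellman-Ford relaxation step expressed as message passing with min aggregation; it is the kind of classical algorithm a trained GNN processor would be asked to imitate. The function names `encode`, `process_step`, and `decode` are hypothetical, and in a real NAR system all three would be neural networks rather than plain functions.

```python
import math
from typing import List, Optional, Tuple

Edge = Tuple[int, int, float]  # (source node, target node, edge weight)


def encode(num_nodes: int, source: int) -> List[float]:
    """Encoder stand-in: map the raw input to a per-node state (tentative distance)."""
    return [0.0 if v == source else math.inf for v in range(num_nodes)]


def process_step(state: List[float], edges: List[Edge]) -> List[float]:
    """Processor stand-in: one synchronous round of min-aggregated message passing."""
    new_state = list(state)
    for u, v, w in edges:
        message = state[u] + w  # message sent along edge u -> v
        if message < new_state[v]:
            new_state[v] = message
    return new_state


def decode(state: List[float]) -> List[Optional[float]]:
    """Decoder stand-in: map node states to outputs; unreachable nodes become None."""
    return [None if math.isinf(d) else d for d in state]


def shortest_paths(num_nodes: int, edges: List[Edge], source: int) -> List[Optional[float]]:
    """Run encode -> (num_nodes - 1) processor steps -> decode."""
    state = encode(num_nodes, source)
    for _ in range(num_nodes - 1):
        state = process_step(state, edges)
    return decode(state)


EDGES: List[Edge] = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 1.0), (2, 3, 5.0)]

if __name__ == "__main__":
    print(shortest_paths(5, EDGES, source=0))  # -> [0.0, 3.0, 1.0, 4.0, None]
```

Swapping `process_step` for a trained network while keeping the loop structure is, roughly, what makes the processor reusable across different encoders and decoders.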
- Deep Networks Always Grok and Here is Why
- Grokfast: Accelerated Grokking by Amplifying Slow Gradients, review post by Lucas Nestler
- Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets
Weak methods are general but don't use knowledge (heuristics) to guide the search process.
- depth-first-search (DFS)
- breadth-first-search (BFS)
- depth-limited-search, iterative-deepening-depth-first-search (IDDFS)
- generate-and-test
- hill-climbing (borderline case between weak and strong methods)
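
As a small illustration of the weak methods listed above, the following sketch implements breadth-first search and iterative-deepening depth-first search on a toy graph. Neither uses any heuristic knowledge about the goal. The graph `TOY_GRAPH` and the function names are made up for this example.

```python
from collections import deque
from typing import Dict, List, Optional

Graph = Dict[str, List[str]]


def bfs(graph: Graph, start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search: returns a path with the fewest edges, or None."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None


def depth_limited(graph: Graph, goal: str, limit: int, path: List[str]) -> Optional[List[str]]:
    """Depth-first search that gives up once `limit` further edges have been used."""
    node = path[-1]
    if node == goal:
        return path
    if limit == 0:
        return None
    for nxt in graph.get(node, []):
        if nxt not in path:  # avoid cycles along the current path
            found = depth_limited(graph, goal, limit - 1, path + [nxt])
            if found is not None:
                return found
    return None


def iddfs(graph: Graph, start: str, goal: str, max_depth: int) -> Optional[List[str]]:
    """Iterative deepening: repeat depth-limited search with a growing limit."""
    for limit in range(max_depth + 1):
        found = depth_limited(graph, goal, limit, [start])
        if found is not None:
            return found
    return None


TOY_GRAPH: Graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": ["F"],
    "F": [],
}

if __name__ == "__main__":
    print(bfs(TOY_GRAPH, "A", "F"))       # -> ['A', 'B', 'D', 'F']
    print(iddfs(TOY_GRAPH, "A", "F", 5))  # -> ['A', 'B', 'D', 'F']
```

IDDFS revisits shallow nodes on every iteration, but it keeps the low memory footprint of DFS while still finding a shallowest solution, as BFS does.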
- The Soar Cognitive Architecture, John E. Laird, MIT Press, 2019
- How to Build a Brain: A Neural Architecture for Biological Cognition Chris Eliasmith, Oxford Series on Cognitive Models and Architectures, 2013
- Active Inference: The Free Energy Principle in Mind, Brain, and Behavior, Thomas Parr, Giovanni Pezzulo, Karl J. Friston, MIT Press, 2022, MLST Interview with Thomas Parr
- Principles of Synthetic Intelligence PSI: An Architecture of Motivated Cognition, Joscha Bach, Oxford Series on Cognitive Models and Architectures Book 4, 2009
- Conscious Mind, Resonant Brain: How Each Brain Makes a Mind, Stephen Grossberg, Oxford University Press, 2021
- The Society of Mind, Marvin Minsky, Simon & Schuster, 1986
- Reinforcement Learning: An Introduction 2nd Edition, Sutton & Barto, MIT Press, 2018
- Mathematical Foundations of Reinforcement Learning, Shiyu Zhao, open course on github + video lectures
Diverse approaches: some tap into classical PDE systems of biological NNs, some concentrate on Distributed Sparse Representations (non-differentiable by default), and others draw inspiration from hippocampal grid cells, place cells, etc. Biological systems surpass most ML methods at continual and online learning, but are hard to implement efficiently on GPUs.
- Ogma Sparse Predictive Hierarchies (SPH): whitepaper
- The Tolman-Eichenbaum Machine: Unifying space and relational memory through generalisation in the hippocampal formation (TEM), TEM-t
- Arousal as a universal embedding for spatiotemporal brain dynamics
- Sparse Distributed Memory is a Continual Learner
- Computation with Sequences of Assemblies in a Model of the Brain
Dense Associative Memory is mainly represented by Modern Hopfield Networks (MHN), which can be viewed as generalized Transformers capable of storing queries, keys and values explicitly (as in vector databases) and running recurrent retrieval by energy minimization (relating them to diffusion models). Application to continual learning is possible when combined with uncertainty quantification and differentiable top-k selection.
- xLSTM repository
- CAMELoT: Towards Large Language Models with Training-Free Consolidated Associative Memory
- Energy Transformer
- Memorization and consolidation in associative memory networks
- Simplicial Hopfield networks
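
The recurrent retrieval mentioned in the introduction to this section can be sketched in a few lines. The example below uses the softmax update commonly associated with Modern Hopfield Networks: the state is repeatedly replaced by a softmax-weighted average of the stored patterns, which is also why the step resembles attention. The stored patterns, the noisy query, and the values of `beta` and `steps` are arbitrary choices for illustration, and the function names are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hopfield_retrieve(
    patterns: List[List[float]],
    query: List[float],
    beta: float = 4.0,
    steps: int = 3,
) -> List[float]:
    """Iterate state <- sum_i softmax(beta * <pattern_i, state>)_i * pattern_i."""
    state = list(query)
    for _ in range(steps):
        # Similarity of the current state to every stored pattern.
        scores = [beta * sum(p * s for p, s in zip(pattern, state)) for pattern in patterns]
        weights = softmax(scores)
        # New state: attention-weighted average of the stored patterns.
        state = [
            sum(w * pattern[d] for w, pattern in zip(weights, patterns))
            for d in range(len(state))
        ]
    return state


PATTERNS = [
    [1.0, 1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0, 1.0],
    [1.0, -1.0, 1.0, -1.0],
]
NOISY_QUERY = [0.9, 0.8, -0.2, -1.0]  # corrupted version of the first pattern

if __name__ == "__main__":
    retrieved = hopfield_retrieve(PATTERNS, NOISY_QUERY)
    print([round(x, 3) for x in retrieved])
```

A larger `beta` makes retrieval snap to a single stored pattern, while a small `beta` blends several patterns into a mixture state.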
- paul-gauthier/aider
- continuedev/continue
- OpenDevin
- princeton-nlp/SWE-agent, documentation
- meta-llama/llama-agentic-system
- stanfordnlp/dspy
- InternLM/lagent - lightweight framework for building LLM-based agents
- Software Engineering
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents, gh: camel-ai/crab
- WebArena: A Realistic Web Environment for Building Autonomous Agents, web: project page, Leaderboard
- ARC-AGI: Leaderboard, On the Measure of Intelligence
- PlanBench: Paper, gh: karthikv792/LLMs-Planning
- GAIA: a benchmark for General AI Assistants: Leaderboard
- StreamBench: Towards Benchmarking Continuous Improvement of Language Agents, gh: stream-bench/stream-bench
- VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents
- SWE-bench, SWE-bench Lite
- BigCodeBench: The Next Generation of HumanEval, Leaderboard
- SciCode: A Research Coding Benchmark Curated by Scientists, web: https://scicode-bench.github.io/
- Nous Research Open Reasoning Tasks, a list of reasoning tasks, gh: NousResearch/Open-Reasoning-Tasks
- Channel: David Shapiro
- Artem Kirsanov: Engrams, Building Blocks of Memory in the Brain
- Channel: Edan Meyer on AI, ML & RL, Discrete vs. Continuous RL + Paper
- MIT AGI: Cognitive Architecture (Nate Derbinsky)
- Channel: Thinking About Thinking (Mathematics of Neuroscience and AI)
- Machine Consciousness
- Consciousness as a coherence-inducing operator, talk by Joscha Bach at the Models of Consciousness Conferences
https://s2r-at-scale-workshop.github.io (NeurIPS 2024)
To share a link related to reasoning in AI systems that is missing here, please create a pull request for this file. See editing files in the GitHub documentation.