🚀 Awesome LLM Agent: Discover LLM-Empowered Agents & Their Exciting Applications

Awesome-LLM-Agent

PR Welcome License: MIT Awesome

Welcome to our comprehensive collection on LLM-based agents, with an emphasis on reasoning, memory, action, and related applications. Dive into a diverse array of academic papers, benchmarks, and open-source projects that explore the depths of LLM capabilities. This repo is actively maintained and frequently updated 🧑‍💻. Stay tuned for the latest advancements in the field 🚀!

Papers

🔥 for papers with >100 citations or repositories with >500 stars.

🚀 for papers with >300 citations or repositories with >1500 stars.

Survey 🔍

  • 🔥 (arXiv 2023.08) A Survey on Large Language Model based Autonomous Agents [Paper] [GitHub]
  • 🔥 (arXiv 2023.09) The Rise and Potential of Large Language Model Based Agents: A Survey [Paper] [GitHub]
  • (arXiv 2023.10) AI Alignment: A Comprehensive Survey [Paper]
  • 🔥 (arXiv 2023.12) Retrieval-Augmented Generation for Large Language Models: A Survey [Paper] [GitHub]
  • (arXiv 2024.01) Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security [Paper] [GitHub]
  • (arXiv 2024.01) Large Language Model based Multi-Agents: A Survey of Progress and Challenges [Paper] [GitHub]
  • 🔥 (TMLR'2024) Cognitive Architectures for Language Agents [Paper] [GitHub]
  • (arXiv 2024.01) Agent AI: Surveying the Horizons of Multimodal Interaction [Paper]

Benchmark 📈

  • 🔥 (NeurIPS'2022) WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents [Paper] [GitHub] [Website]
  • (EACL'2023) MTEB: Massive Text Embedding Benchmark [Paper] [GitHub] [Leaderboard]
  • (EMNLP'2023) API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs [Paper] [GitHub]
  • 🔥 (NeurIPS'2023) PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change [Paper] [GitHub]
  • (NeurIPS'2023) ToolQA: A Dataset for LLM Question Answering with External Tools [Paper] [GitHub]
  • (arXiv 2023.09) Benchmarking Large Language Models in Retrieval-Augmented Generation [Paper] [GitHub]
  • 🔥 (ICLR'2024) WebArena: A Realistic Web Environment for Building Autonomous Agents [Paper] [GitHub] [Website]
  • 🚀 (ICLR'2024) AgentBench: Evaluating LLMs as Agents [Paper] [GitHub] [Website]
  • (arXiv 2023.10) Benchmarking Large Language Models As AI Research Agents [Paper] [GitHub]
  • (arXiv 2023.12) T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step [Paper] [GitHub] [Website]
  • (arXiv 2024.01) VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks [Paper] [GitHub] [Website]
  • (arXiv 2024.03) DevBench: A Comprehensive Benchmark for Software Development [Paper] [GitHub]
  • (arXiv 2024.04) AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent [Paper] [GitHub]
  • (arXiv 2024.04) STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases [Paper] [GitHub]

Reasoning and Prompt Engineering 💡

  • 🚀 (NeurIPS'2022) Chain-of-Thought Prompting Elicits Reasoning in Large Language Models [Paper]
  • 🚀 (ICLR'2023) ReAct: Synergizing Reasoning and Acting in Language Models [Paper] [GitHub] [Website]
  • 🔥 (arXiv 2023.05) ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models [Paper] [GitHub]
  • 🔥 (EMNLP'2023) Reasoning with Language Model is Planning with World Model [Paper] [GitHub]
  • 🚀 (NeurIPS'2023) Tree of Thoughts: Deliberate Problem Solving with Large Language Models [Paper] [GitHub]
  • 🚀 (NeurIPS'2023) Reflexion: Language Agents with Verbal Reinforcement Learning [Paper] [GitHub]
  • 🚀 (NeurIPS'2023) Self-Refine: Iterative Refinement with Self-Feedback [Paper] [GitHub]
  • (NeurIPS'2023) Self-Evaluation Guided Beam Search for Reasoning [Paper] [GitHub] [Website]
  • 🚀 (arXiv 2023.08) Graph of Thoughts: Solving Elaborate Problems with Large Language Models [Paper] [GitHub]
  • (ICLR'2024) Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph [Paper] [GitHub]
  • (ICLR'2024) Thought Propagation: An Analogical Approach to Complex Reasoning with Large Language Models [Paper] [GitHub]
  • (arXiv 2024.01) Topologies of Reasoning: Demystifying Chains, Trees, and Graphs of Thoughts [Paper]
  • (arXiv 2024.01) Self-Rewarding Language Models [Paper]

Memory and Retrieval Augmented Generation ⚙️

  • 🚀 (PMLR'2022) Improving language models by retrieving from trillions of tokens [Paper] [GitHub]
  • (arXiv 2023.01) REPLUG: Retrieval-Augmented Black-Box Language Models [Paper]
  • 🔥 (EMNLP'2023) Active Retrieval Augmented Generation [Paper] [GitHub]
  • (EMNLP'2023 findings) Self-Knowledge Guided Retrieval Augmentation for Large Language Models [Paper]
  • 🚀 (ICLR'2024) DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines [Paper] [GitHub]
  • (ICLR'2024) Retrieval meets Long Context Large Language Models [Paper]
  • 🔥 (ICLR'2024) Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection [Paper] [GitHub] [Website]
  • (NAACL'2024) REST: Retrieval-Based Speculative Decoding [Paper] [GitHub]
  • (arXiv 2023.11) Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models [Paper]
  • (arXiv 2024.02) G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering [Paper] [GitHub]
  • (arXiv 2024.03) RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation [Paper] [GitHub] [Website] [Demo]
  • (arXiv 2024.03) RAFT: Adapting Language Model to Domain Specific RAG [Paper] [GitHub] [Website]
  • (arXiv 2024.04) Introducing Super RAGs in Mistral 8x7B-v1 [Paper]

Action and Tool Using 🛠️

  • 🔥 (CVPR'2023) Visual Programming: Compositional visual reasoning without training [Paper] [GitHub]

  • 🚀 (NeurIPS'2023) Toolformer: Language Models Can Teach Themselves to Use Tools [Paper] [GitHub]

  • 🚀 (arXiv 2023.05) Gorilla: Large Language Model Connected with Massive APIs [Paper] [GitHub] [Website]

  • (arXiv 2023.05) ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings [Paper] [GitHub]

  • (arXiv 2023.06) ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases [Paper] [GitHub]

  • 🚀 (ICLR'2024) ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs [Paper] [GitHub]

  • 🚀 (TMLR'2024) Voyager: An Open-Ended Embodied Agent with Large Language Models [Paper] [GitHub]

Agent Fine-Tuning 🤖

  • 🔥 (arXiv 2023.10) AgentTuning: Enabling Generalized Agent Abilities for LLMs [Paper] [GitHub] [Website]

  • (arXiv 2023.10) FireAct: Toward Language Agent Fine-tuning [Paper] [GitHub] [Website]

  • (arXiv 2024.02) AUTOACT: Automatic Agent Learning from Scratch via Self-Planning [Paper] [GitHub] [Website]

  • (arXiv 2024.03) Agent Lumos: Unified and Modular Training for Open-Source Language Agents [Paper] [GitHub] [Website]

  • (arXiv 2024.03) Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models [Paper] [GitHub] [Website]

LLM Fine-Tuning 🧠

  • 🚀 (NeurIPS'2022) Training language models to follow instructions with human feedback [Paper] [GitHub]
  • 🚀 (NeurIPS'2023) Direct Preference Optimization: Your Language Model is Secretly a Reward Model [Paper] [GitHub]
  • (arXiv 2024.01) Self-Rewarding Language Models [Paper] [GitHub]
  • (arXiv 2024.02) Noise Contrastive Alignment of Language Models with Explicit Rewards [Paper] [GitHub]

Applications 💻

Web Agents

  • 🔥 (NeurIPS'2023) Mind2Web: Towards a Generalist Agent for the Web [Paper] [GitHub]
  • (NeurIPS'2023 workshops) LASER: LLM Agent with State-Space Exploration for Web Navigation [Paper]
  • (ICLR'2024) A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis [Paper] [GitHub]
  • 🔥 (arXiv 2024.01) GPT-4V(ision) is a Generalist Web Agent, if Grounded [Paper] [GitHub] [Website]

Recommender Agents

  • (arXiv 2023.08) RecMind: Large Language Model Powered Agent For Recommendation [Paper]
  • (arXiv 2023.10) On Generative Agents in Recommendation [Paper] [GitHub]
  • (arXiv 2023.10) AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems [Paper]

Code Agents

  • 🔥 (ICLR'2024) SWE-bench: Can Language Models Resolve Real-World GitHub Issues? [Paper] [GitHub] [Website]
  • 🚀 (arXiv 2024.04) AutoCodeRover: Autonomous Program Improvement [Paper] [GitHub]
  • (arXiv 2024.04) Can Language Models Solve Olympiad Programming? [Paper] [GitHub]

Paper Review Agents

  • (arXiv 2023.10) Can large language models provide useful feedback on research papers? A large-scale empirical analysis [Paper] [GitHub]
  • (arXiv 2024.01) MARG: Multi-Agent Review Generation for Scientific Papers [Paper] [GitHub]
  • (arXiv 2024.02) Reviewer2: Optimizing Review Generation Through Prompt Generation [Paper] [GitHub]
  • (CHI'2024) A Design Space for Intelligent and Interactive Writing Assistants [Paper] [GitHub] [Website]

Trading Agents

  • (ICLR'2024) SocioDojo: Building Lifelong Analytical Agents with Real-world Text and Time Series [Paper] [GitHub]
  • (ICLR'2024 workshops) FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design [Paper]

Others

  • 🚀 (UIST'2023) Generative Agents: Interactive Simulacra of Human Behavior [Paper] [GitHub]

  • 🚀 (NeurIPS'2023) HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face [Paper] [GitHub]

  • 🔥 (ICLR'2024) ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving [Paper] [GitHub]

  • (arXiv 2024.04) Octopus v2: On-device language model for super agent [Paper]

  • (arXiv 2024.04) Empowering Biomedical Discovery with AI Agents [Paper]

Open-Source Projects

LLM Platform

| Title | Link | Description |
| --- | --- | --- |
| FastChat | lm-sys/FastChat | An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. |
| 🦜️🔗 LangChain | langchain-ai/langchain | 🦜🔗 Build context-aware reasoning applications |
| 🗂️ LlamaIndex 🦙 | run-llama/llama_index | LlamaIndex is a data framework for your LLM applications |
| LLaMA-Factory | hiyouga/Llama-Factory | Unify Efficient Fine-Tuning of 100+ LLMs |
| Petals 🌸 | bigscience-workshop/petals | 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading |
| Open-Assistant | LAION-AI/Open-Assistant | OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so. |

Multi-Agent Framework

| Title | Link | Description |
| --- | --- | --- |
| CAMEL 🐫 | camel-ai/camel | 🐫 CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society |
| AutoGen | microsoft/autogen | A programming framework for agentic AI. |
| 🤖 AgentVerse 🪐 | OpenBMB/AgentVerse | 🤖 AgentVerse 🪐 is designed to facilitate the deployment of multiple LLM-based agents in various applications, which primarily provides two frameworks: task-solving and simulation |

Vector Database

| Title | Link | Description |
| --- | --- | --- |
| Chroma | chroma-core/chroma | the AI-native open-source embedding database |
| Faiss | facebookresearch/faiss | A library for efficient similarity search and clustering of dense vectors. |