LLM-Agents-Papers

✍️ Description

Last Updated Time: 2024/1/13

A repo listing papers related to LLM-based agents. It includes:

  • methods of role playing, memory mechanisms, and game playing
  • methods of feedback or reflection
  • methods of multi-agent collaboration
  • methods of tool usage or human-agent interaction
  • benchmarks and surveys of the field
  • environments or platforms
  • agent fine-tuning

💛 Recommendation

For more comprehensive reading, we also recommend other paper lists:

📰 Papers

  • Survey
    • [2024/01/01] If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents | [paper] | [code]

    • [2023/12/31] A Survey of Personality, Persona, and Profile in Conversational Agents and Chatbots | [paper] | [code]

    • [2023/12/19] Large Language Models Empowered Agent-based Modeling and Simulation: A Survey and Perspectives | [paper] | [code]

    • [2023/09/14] The Rise and Potential of Large Language Model Based Agents: A Survey | [paper] | [code]

    • [2023/08/22] A Survey on Large Language Model based Autonomous Agents | [paper] | [code]

    • [2023/06/27] Next Steps for Human-Centered Generative AI: A Technical Perspective | [paper] | [code]

    • [2023/04/06] Can Large Language Models Play Text Games Well? Current State-of-the-Art and Open Questions | [paper] | [code]


  • Agent Fine-tuning
    • [2024/01/10] Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training | [paper] | [code]

    • [2024/01/10] Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk | [paper] | [code]

    • [2024/01/05] From LLM to Conversational Agent: A Memory Enhanced Architecture with Fine-Tuning of Large Language Models | [paper] | [code]

    • [2023/12/20] Machine Mindset: An MBTI Exploration of Large Language Models | [paper] | [code]

    • [2023/10/19] AgentTuning: Enabling Generalized Agent Abilities for LLMs | [paper] | [code]

    • [2023/10/09] FireAct: Toward Language Agent Fine-tuning | [paper] | [code]

    • [2023/10/01] Adapting LLM Agents Through Communication | [paper] | [code]


  • Role Playing
    • [2024/01/10] AUTOACT: Automatic Agent Learning from Scratch via Self-Planning | [paper] | [code]

    • [2024/01/09] Agent Alignment in Evolving Social Norms | [paper] | [code]

    • [2023/12/28] Experiential Co-Learning of Software-Developing Agents | [paper] | [code]

    • [2023/12/27] Automating Knowledge Acquisition for Content-Centric Cognitive Agents Using LLMs | [paper] | [code]

    • [2023/12/21] ChatGPT as a commenter to the news: can LLMs generate human-like opinions? | [paper] | [code]

    • [2023/12/19] Can ChatGPT be Your Personal Medical Assistant? | [paper] | [code]

    • [2023/12/06] LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem | [paper] | [code]

    • [2023/11/28] War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars | [paper] | [code]

    • [2023/11/10] Smart Agent-Based Modeling: On the Use of Large Language Models in Computer Simulations | [paper] | [code]

    • [2023/10/01] RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models | [paper] | [code]

    • [2023/09/08] Unleashing the Power of Graph Learning through LLM-based Autonomous Agents | [paper] | [code]

    • [2023/09/05] Cognitive Architectures for Language Agents | [paper] | [code]

    • [2023/08/14] ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate | [paper] | [code]

    • [2023/08/10] LLM As DBA | [paper] | [code]

    • [2023/08/07] TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents | [paper] | [code]

    • [2023/07/24] To Infinity and Beyond: SHOW-1 and Showrunner Agents in Multi-Agent Simulations | [paper] | [code]

    • [2023/06/28] Inferring the Goals of Communicating Agents from Actions and Instructions | [paper] | [code]

    • [2023/05/27] SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks | [paper] | [code]

    • [2023/05/26] Training Socially Aligned Language Models in Simulated Human Society | [paper] | [code]

    • [2023/05/25] Role-Play with Large Language Models | [paper] | [code]

    • [2023/05/24] Reasoning with Language Model is Planning with World Model | [paper] | [code]

    • [2023/05/17] Tree of Thoughts: Deliberate Problem Solving with Large Language Models | [paper] | [code]

    • [2023/05/09] TidyBot: Personalized Robot Assistance with Large Language Models | [paper] | [code]

    • [2023/05/02] The Role of Summarization in Generative Agents: A Preliminary Perspective | [paper] | [code]

    • [2023/04/26] Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models | [paper] | [code]

    • [2023/04/24] ChatLLM Network: More brains, More intelligence | [paper] | [code]

    • [2023/04/15] Self-collaboration Code Generation via ChatGPT | [paper] | [code]

    • [2023/04/07] Generative Agents: Interactive Simulacra of Human Behavior | [paper] | [code]

    • [2023/03/31] CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society | [paper] | [code]

    • [2022/12/08] LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models | [paper] | [code]


  • Multi-Agent Collaboration
    • [2024/01/08] MARG: Multi-Agent Review Generation for Scientific Papers | [paper] | [code]

    • [2024/01/08] SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems | [paper] | [code]

    • [2024/01/08] Why Solving Multi-agent Path Finding with Large Language Model has not Succeeded Yet | [paper] | [code]

    • [2023/12/20] AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation | [paper] | [code]

    • [2023/10/10] MetaAgents: Simulating Interactions of Human Behaviors for LLM-based Task-oriented Coordination via Collaborative Generative Agents | [paper] | [code]

    • [2023/10/03] Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View | [paper] | [code]

    • [2023/09/22] Learning to Coordinate with Anyone | [paper] | [code]

    • [2023/08/21] AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents | [paper] | [code]

    • [2023/08/03] InterAct: Exploring the Potentials of ChatGPT as a Cooperative Agent | [paper] | [code]

    • [2023/08/01] MetaGPT: Meta Programming for Multi-Agent Collaborative Framework | [paper] | [code]

    • [2023/07/16] Communicative Agents for Software Development | [paper] | [code]

    • [2023/07/11] Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration | [paper] | [code]

    • [2023/07/05] Building Cooperative Embodied Agents Modularly with Large Language Models | [paper] | [code]

    • [2023/06/05] Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents | [paper] | [code]


  • Feedback & Reflection
    • [2023/11/14] The ART of LLM Refinement: Ask, Refine, and Trust | [paper] | [code]

    • [2023/10/31] Learning From Mistakes Makes LLM Better Reasoner | [paper] | [code]

    • [2023/08/01] SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning | [paper] | [code]

    • [2023/07/27] PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback | [paper] | [code]

    • [2023/05/30] Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate | [paper] | [code]

    • [2023/05/26] AdaPlanner: Adaptive Planning from Feedback with Language Models | [paper] | [code]

    • [2023/05/22] Making Language Models Better Tool Learners with Execution Feedback | [paper] | [code]

    • [2023/04/21] Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents Through Help Feedback | [paper] | [code]

    • [2023/04/11] Teaching Large Language Models to Self-Debug | [paper] | [code]

    • [2023/03/30] Self-Refine: Iterative Refinement with Self-Feedback | [paper] | [code]


  • Memory Mechanism
    • [2023/12/22] Empowering Working Memory for Large Language Model Agents | [paper] | [code]

    • [2023/12/22] Evolving Large Language Model Assistant with Long-Term Conditional Memory | [paper] | [code]

    • [2023/10/16] CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization | [paper] | [code]

    • [2023/06/06] ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory | [paper] | [code]

    • [2023/05/31] Monotonic Location Attention for Length Generalization | [paper] | [code]

    • [2023/05/26] Randomized Positional Encodings Boost Length Generalization of Transformers | [paper] | [code]

    • [2023/05/25] Landmark Attention: Random-Access Infinite Context Length for Transformers | [paper] | [code]

    • [2023/05/24] Revisiting Parallel Context Windows: A Frustratingly Simple Alternative and Chain-of-Thought Deterioration | [paper] | [code]

    • [2023/05/24] Adapting Language Models to Compress Contexts | [paper] | [code]

    • [2023/05/23] RET-LLM: Towards a General Read-Write Memory for Large Language Models | [paper] | [code]

    • [2023/05/22] RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text | [paper] | [code]

    • [2023/05/19] ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings | [paper] | [code]

    • [2023/05/17] MemoryBank: Enhancing Large Language Models with Long-Term Memory | [paper] | [code]

    • [2023/05/15] Small Models are Valuable Plug-ins for Large Language Models | [paper] | [code]

    • [2023/05/02] Unlimiformer: Long-Range Transformers with Unlimited Length Input | [paper] | [code]

    • [2023/05/01] Learning to Reason and Memorize with Self-Notes | [paper] | [code]

    • [2023/04/27] ChatLog: Recording and Analyzing ChatGPT Across Time | [paper] | [code]

    • [2023/04/26] Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System | [paper] | [code]

    • [2023/04/21] Emergent and Predictable Memorization in Large Language Models | [paper] | [code]

    • [2023/03/17] CoLT5: Faster Long-Range Transformers with Conditional Computation | [paper] | [code]


  • Game Playing
    • [2023/12/29] Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game | [paper] | [code]

    • [2023/10/31] Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models | [paper] | [code]

    • [2023/09/29] Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4 | [paper] | [code]

    • [2023/09/10] An Appraisal-Based Chain-Of-Emotion Architecture for Affective Language Model Game Agents | [paper] | [code]

    • [2023/09/09] Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf | [paper] | [code]

    • [2023/08/23] Are ChatGPT and GPT-4 Good Poker Players? -- A Pre-Flop Analysis | [paper] | [code]

    • [2023/05/31] Recursive Metropolis-Hastings Naming Game: Symbol Emergence in a Multi-agent System based on Probabilistic Generative Models | [paper] | [code]

    • [2023/05/26] Playing repeated games with Large Language Models | [paper] | [code]

    • [2023/05/25] Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory | [paper] | [code]

    • [2023/05/25] Voyager: An Open-Ended Embodied Agent with Large Language Models | [paper] | [code]

    • [2023/05/17] Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback | [paper] | [code]

    • [2023/05/08] Knowledge-enhanced Agents for Interactive Text Games | [paper] | [code]

    • [2023/03/29] Plan4MC: Skill Reinforcement Learning and Planning for Open-World Minecraft Tasks | [paper] | [code]


  • Game Platform
    • [2023/03/14] CB2: Collaborative Natural Language Interaction Research Platform | [paper] | [code]

  • Benchmark & Evaluation & Framework
    • [2024/01/05] AFSPP: Agent Framework for Shaping Preference and Personality with Large Language Models | [paper] | [code]

    • [2024/01/02] CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation | [paper] | [code]

    • [2023/12/28] How Far Are We from Believable AI Agents? A Framework for Evaluating the Believability of Human Behavior Simulation | [paper] | [code]

    • [2023/12/26] RoleEval: A Bilingual Role Evaluation Benchmark for Large Language Models | [paper] | [code]

    • [2023/11/17] Testing Language Model Agents Safely in the Wild | [paper] | [code]

    • [2023/11/16] ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks | [paper] | [code]

    • [2023/11/15] ToolTalk: Evaluating Tool-Usage in a Conversational Setting | [paper] | [code]

    • [2023/11/02] ProAgent: From Robotic Process Automation to Agentic Process Automation | [paper] | [code]

    • [2023/10/24] FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions | [paper] | [code]

    • [2023/10/09] Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena | [paper] | [code]

    • [2023/10/02] SmartPlay: A Benchmark for LLMs as Intelligent Agents | [paper] | [code]

    • [2023/09/29] Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency | [paper] | [code]

    • [2023/09/14] Agents: An Open-source Framework for Autonomous Language Agents | [paper] | [code]

    • [2023/08/22] ProAgent: Building Proactive Cooperative AI with Large Language Models | [paper] | [code]

    • [2023/08/11] BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents | [paper] | [code]

    • [2023/08/07] AgentBench: Evaluating LLMs as Agents | [paper] | [code]

    • [2023/07/31] HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution | [paper] | [code]

    • [2023/06/09] Mind2Web: Towards a Generalist Agent for the Web | [paper] | [code]


  • Tool Usage & Human-Agent Interaction
    • [2024/01/03] GPT-4V(ision) is a Generalist Web Agent, if Grounded | [paper] | [code]

    • [2023/12/21] Team Flow at DRC2023: Building Common Ground and Text-based Turn-taking in a Travel Agent Spoken Dialogue System | [paper] | [code]

    • [2023/12/21] AppAgent: Multimodal Agents as Smartphone Users | [paper] | [code]

    • [2023/12/14] CogAgent: A Visual Language Model for GUI Agents | [paper] | [code]

    • [2023/11/19] TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems | [paper] | [code]

    • [2023/10/18] MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models | [paper] | [code]

    • [2023/10/13] AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems | [paper] | [code]

    • [2023/10/12] A Zero-Shot Language Agent for Computer Control with Structured Reflection | [paper] | [code]

    • [2023/09/02] ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models | [paper] | [code]

    • [2023/08/07] TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents | [paper] | [code]

    • [2023/06/05] When Large Language Model based Agent Meets User Behavior Analysis: A Novel User Simulation Paradigm | [paper] | [code]

