
Paper list for reasoning and agents


2024 Jan

Analyze/Optimize Reflection

  • AutoPlan: uses no human demonstrations; instead collects feedback from the environment and generates reflections from it

2023 Dec

Improve Planning abilities using RL

  • RetroFormer: freezes the base LLM and trains a separate model with policy gradient methods to refine the reflections
  • Adapting LLM Agents Through Communication: applies PPO training directly to an open-source LLM, based on feedback and agent exploration trajectories
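
As a rough illustration of the policy gradient idea behind both entries, here is a minimal sketch on a toy problem. It is not either paper's training procedure: the policy is a plain softmax over two hypothetical candidate reflections, the rewards are invented, and the gradient is computed exactly in expectation rather than estimated from sampled trajectories (and without PPO's clipping).

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(
    logits: List[float], rewards: List[float], lr: float = 0.5
) -> List[float]:
    """One exact gradient-ascent step on expected reward J = sum_a pi(a) * r(a)."""
    probs = softmax(logits)
    expected_reward = sum(p * r for p, r in zip(probs, rewards))  # J
    # For a softmax policy, dJ/dlogit_k = pi_k * (r_k - J).
    return [
        l + lr * p * (r - expected_reward)
        for l, p, r in zip(logits, probs, rewards)
    ]


# Hypothetical rewards: candidate 1 (a more useful reflection) scores higher.
rewards = [0.2, 1.0]
logits = [0.0, 0.0]
for _ in range(50):
    logits = policy_gradient_step(logits, rewards)
probs = softmax(logits)
```

After the updates, `probs` puts more weight on the higher-reward candidate. In the actual systems the "action" is generated text and the reward comes from the environment, so the gradient has to be estimated from sampled rollouts.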

2023 Dec

Analyze/Optimize reflection

2023 Nov

Agent framework

  • Reflexion: uses reflection to improve performance, and an oracle to determine when the reasoning should stop
  • RAP: uses MCTS for search, with the LLM serving as the world model
  • RATS: combines MCTS, reward evaluation, and reflection in the LLM search process
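
The reflect-and-retry pattern in the Reflexion entry can be sketched as a short loop. This is a minimal sketch under stated assumptions, not the paper's implementation: `act`, `oracle`, and `reflect` are hypothetical callables standing in for LLM calls and the success check, and the toy versions below exist only so the loop runs.

```python
from typing import Callable, List, Tuple


def reflexion_loop(
    act: Callable[[str, List[str]], str],
    oracle: Callable[[str], bool],
    reflect: Callable[[str, str], str],
    task: str,
    max_trials: int = 5,
) -> Tuple[str, List[str]]:
    """Retry a task, feeding reflections on failed trials back to the actor."""
    reflections: List[str] = []
    answer = ""
    for _ in range(max_trials):
        answer = act(task, reflections)
        if oracle(answer):  # the oracle decides when to stop
            break
        reflections.append(reflect(task, answer))
    return answer, reflections


# Toy stand-ins (assumptions): a real agent would call an LLM here.
def toy_act(task: str, reflections: List[str]) -> str:
    # Each stored reflection nudges the guess up by one.
    return str(2 + len(reflections))


def toy_oracle(answer: str) -> bool:
    return answer == "4"


def toy_reflect(task: str, answer: str) -> str:
    return f"'{answer}' was wrong for '{task}'; try a larger number."


answer, memory = reflexion_loop(toy_act, toy_oracle, toy_reflect, "What is 2 + 2?")
```

Here the toy actor fails twice, accumulates two reflections, and succeeds on the third trial. Keeping the reflections in a list passed back to the actor mirrors the idea of an episodic memory that persists across trials.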

2023 Oct

Improve COT capabilities

  • []

2023 Oct

Open Domain QA Datasets

2023 Oct

Open Domain QA methods

2023 Sep

Tool Use