AWM: Agent Workflow Memory

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Quickstart 💥

To run AWM on WebArena under webarena/:

cd webarena
python pipeline.py --website "shopping" # choose one from ['shopping', 'shopping_admin', 'reddit', 'gitlab', 'map']

To run AWM on Mind2Web under mind2web/:

cd mind2web
python pipeline.py --setup "offline" # or "online"

See the webarena/ and mind2web/ folders for detailed instructions on environment and data setup.

What is Agent Workflow Memory? 🧠

Agent Workflow Memory (AWM) proposes to induce, integrate, and utilize workflows in the agent's memory. A workflow is typically a common sub-routine for solving tasks, with example-specific context abstracted out.
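To make the idea of "abstracting out example-specific context" concrete, here is a minimal sketch. The `Workflow` class, the `abstract_steps` helper, the action strings, and the example data are all hypothetical illustrations, not the data structures used in this repository.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    """A reusable sub-routine: a description plus abstracted steps (hypothetical structure)."""
    description: str
    steps: list[str]


def abstract_steps(steps: list[str], context: dict[str, str]) -> list[str]:
    """Replace example-specific values with named placeholders such as {product}."""
    abstracted = []
    for step in steps:
        for name, value in context.items():
            step = step.replace(value, "{" + name + "}")
        abstracted.append(step)
    return abstracted


# A concrete trajectory from one solved task (made-up example data).
trajectory = [
    "type('search box', 'dry cat food')",
    "click('Search')",
    "click('dry cat food')",
]

workflow = Workflow(
    description="Search for a product and open its page",
    steps=abstract_steps(trajectory, {"product": "dry cat food"}),
)
print(workflow.steps)
```

The resulting steps mention `{product}` instead of the specific item, so the same workflow could in principle be reused for a different search query.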

AWM can operate in both offline and online settings:

  • offline: when additional (e.g., training) examples are available, agents induce workflows from ground-truth annotated examples.
  • online: without any auxiliary data, agents induce workflows from past experiences on the fly.
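The difference between the two settings can be sketched as two small loops. This is an illustrative sketch only: `induce_workflow`, `offline_memory`, `online_memory`, and the toy solver are hypothetical names, the induction step is a string stand-in for what a real system would likely do with a language model, and keeping only trajectories judged successful in the online loop is an assumption rather than something stated above.

```python
from typing import Callable

Trajectory = list[str]


def induce_workflow(task: str, trajectory: Trajectory) -> str:
    """Stand-in for workflow induction (hypothetical); summarizes a trajectory as text."""
    return f"Workflow for '{task}': " + " -> ".join(trajectory)


def offline_memory(annotated: list[tuple[str, Trajectory]]) -> list[str]:
    """Offline: induce workflows once, from ground-truth annotated examples."""
    return [induce_workflow(task, traj) for task, traj in annotated]


def online_memory(
    tasks: list[str],
    solve: Callable[[str, list[str]], Trajectory],
    is_success: Callable[[str, Trajectory], bool],
) -> list[str]:
    """Online: solve tasks in a stream and grow the memory from past experiences."""
    memory: list[str] = []
    for task in tasks:
        trajectory = solve(task, memory)  # the agent can read the current memory
        # Assumption: only experiences judged successful are turned into workflows.
        if is_success(task, trajectory):
            memory.append(induce_workflow(task, trajectory))
    return memory


# Toy stand-ins for the agent and the success check (made-up behavior).
def toy_solve(task: str, memory: list[str]) -> Trajectory:
    return [f"open({task})", "click('submit')"]


def toy_is_success(task: str, trajectory: Trajectory) -> bool:
    return task != "broken"


offline = offline_memory([("login", ["type(user)", "type(password)", "click('Sign in')"])])
online = online_memory(["search", "broken", "checkout"], toy_solve, toy_is_success)
print(len(offline), len(online))
```

In the offline case the memory is fixed before test time, whereas in the online case it grows as tasks are processed, so later tasks see workflows induced from earlier ones.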

How does AWM work? 📈

On WebArena

We achieve a state-of-the-art result: a 35.6% success rate.

See the code in the ./webarena/ directory.

On Mind2Web

We also get the best scores among text-based agents. In particular, AWM offline generalizes effectively across a wide range of tasks, websites, and domains.

See the code in the ./mind2web/ directory.

Citation 📜

@inproceedings{awm2024wang,
  title = {Agent Workflow Memory},
  author = {Wang, Zhiruo and Mao, Jiayuan and Fried, Daniel and Neubig, Graham},
  journal={arXiv preprint arXiv:2409.07429},
  year = {2024},
}