🛒 WebShop

Implementation of the WebShop environment and search agents for the paper:

WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
Shunyu Yao*, Howard Chen*, John Yang, Karthik Narasimhan

This repository contains code for reproducing results. If you find this work useful in your research, please cite:

@inproceedings{yao2022webshop,
  bibtex_show = {true},
  title = {WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents},
  author = {Yao, Shunyu and Chen, Howard and Yang, John and Narasimhan, Karthik},
  booktitle = {ArXiv},
  year = {preprint},
  html = {https://arxiv.org/abs/2207.01206},
  tag = {NLP}
}

👋 Overview

WebShop is a simulated e-commerce website environment with 1.18 million real-world products and 12,087 crowd-sourced text instructions. In this environment, an agent needs to navigate multiple types of webpages and issue diverse actions to find, customize, and purchase a product given an instruction. WebShop provides several challenges including understanding compositional instructions, query (re-)formulation, dealing with noisy text in webpages, and performing strategic exploration.

Hugging Face Demo: Devise your own natural language query for a product and ask for an agent trained with WebShop to find it on Amazon or eBay, deployed as a 🤗 Hugging Face space here!

🚀 Setup

Our code is implemented in Python. To setup, do the following:

Install Python 3.8.13
Install Java
Download the source code:

> git clone https://github.com/princeton-nlp/webshop.git webshop

Create a virtual environment using Anaconda and activate it

> conda create -n webshop python=3.8.13
> conda activate webshop

Install requirements into the webshop virtual environment via the setup.sh script

> ./setup.sh [-d small|all]

The setup script performs several actions in the following order:

Installs Python dependencies listed in requirements.txt
Downloads product and instruction data for populating WebShop
Downloads spaCy en_core_web_lg model
Construct search engine index from product, instruction data
Downloads 50 randomly chosen trajectories generated by MTurk workers The -d flag argument allows you to specify whether you would like to pull the entire product + instruction data set (-d all) or a subset of 1000 random products (-d small).

By default the WebShop only loads 1,000 products for a faster environment preview. To load all products, change web_agent_site/utils.py:

# DEFAULT_ATTR_PATH = join(BASE_DIR, '../data/items_ins_v2_1000.json')
# DEFAULT_FILE_PATH = join(BASE_DIR, '../data/items_shuffle_1000.json')
DEFAULT_ATTR_PATH = join(BASE_DIR, '../data/items_ins_v2.json')
DEFAULT_FILE_PATH = join(BASE_DIR, '../data/items_shuffle.json')

(Optional) Download ResNet image feature files here and put into data/ for running models that require image features.
(Optional) Human demonstration data and be downloaded here.

🛠️ Usage

The WebShop environment can be rendered in two modes - html and simple - each of which offer a different observation space. The simple mode strips away the extraneous meta-data that the html mode includes to make model training and evaluation easier.

Webpage Environment (`html` mode)

Launch the WebShop webpage:

> ./run_dev.sh

The site should then be viewable in the browser. Go to http://localhost:3000/ABC, where you should land on the search home page with a random instruction.

Navigating the website will automatically generate a corresponding trajectory file in the user_session_logs/mturk folder. Each file corresponds to a single instruction/web session, and each step of the file corresponds to a single action (i.e. search[...], click[...]).

The current WebShop build comes with two flags:

--log: Include this flag to create a trajectory .jsonl log file of actions on WebShop
--attrs: Include this flag to display an Attributes tab on the item_page of WebShop

Text Environment (`simple` mode)

The simple mode of the WebShop environment is packaged and readily available as an OpenAI environment. The OpenAI gym definitions of the text environment can be found in the web_agent_site/envs folder.

To start using the gym and building agents that interact with the WebShop environment, include the following statements in your Python file:

import gym
from web_agent_site.envs import WebAgentTextEnv

env = gym.make('WebAgentTextEnv-v0', observation_mode='text', num_products=...)

Now, you can write your own agent that interacts with the environment via the standard OpenAI gym interface.

Examples of a RandomPolicy agent interacting with the WebShop environment in both html and simple mode can be found in the run_envs folder. To run these examples locally, run the run_web_agent_text_env.sh or run_web_agent_site_env.sh script:

> ./run_web_agent_text_env.sh
Products loaded.
Keys Cleaned.
Attributes Loaded.
100%|██████████████████| 1000/1000
Loaded 6910 goals.
Amazon Shopping Game [SEP] Instruction: [SEP] Find me slim f...
Available actions: {'has_search_bar': True, 'clickables': ['search']}
Taking action "search[shoes]" -> Reward = 0.0
...

In order to run the run_web_agent_site_env.sh script, you must download a version of ChromeDriver compatible with your Chrome browser version. Once you have downloaded and unzipped the executable, rename it chromedriver and place it in the webshop/envs folder.

Baseline Models

To run baseline models (rule, IL, RL, IL+RL) from the paper, please refer to the README.md in the baseline_models folder.

Sim-to-real Transfer

To read more about how the sim-to-real transfer of agents trained on WebShop to other environments works, please refer to the README.md in the transfer folder.

💫 Contributions

We would love to hear from the broader NLP and Machine Learning community, and we welcome any contributions, pull requests, or issues! To do so, please either file a new pull request or issue and fill in the corresponding templates accordingly. We'll be sure to follow up shortly!

🪪 License

Check LICENSE.md

princeton-nlp/WebShop