Task-driven Embodied Agents that Chat
Aishwarya Padmakumar*, Jesse Thomason*, Ayush Shrivastava, Patrick Lange, Anjali Narayan-Chen, Spandana Gella, Robinson Piramuthu, Gokhan Tur, Dilek Hakkani-Tur
TEACh is a dataset of human-human interactive dialogues to complete tasks in a simulated household environment.
- python3
>=3.7,<=3.8
- python3.x-dev, example:
sudo apt install python3.8-dev
- tmux, example:
sudo apt install tmux
- xorg, example:
sudo apt install xorg openbox
- ffmpeg, example:
sudo apt install ffmpeg
pip install -r requirements.txt
pip install -e .
Run the following script:
teach_download
This will download and extract the archive files (experiment_games.tar.gz
, all_games.tar.gz
,
images_and_states.tar.gz
, edh_instances.tar.gz
& tfd_instances.tar.gz
) in the default
directory (/tmp/teach-dataset
).
Optional arguments:
-d
/directory
: The location to store the dataset into. Default=/tmp/teach-dataset
.-se
/--skip-extract
: If set, skip extracting archive files.-sd
/--skip-download
: If set, skip downloading archive files.-f
/--file
: Specify the file name to be retrieved from S3 bucket.
File changes (12/28/2022):
We have modified EDH instances so that the state changes checked for to evaluate success are only those that contribute towards task success in the main task of the gameplay session the EDH instance is created from.
We have removed EDH instances that had no state changes meeting these requirements.
Additionally, two game files, and their corresponding EDH and TfD instances were deleted from the valid_unseen
split due to issues in the game files.
Version 3 of our paper on Arxiv, which will be public on Dec 30, 2022 contains the updated dataset size and experimental results.
If running on a remote server without a display, the following setup will be needed to run episode replay, model inference of any model training that invokes the simulator (student forcing / RL).
Start an X-server
tmux
sudo python ./bin/startx.py
Exit the tmux
session (CTRL+B, D
). Any other commands should be run in the main terminal / different sessions.
Download Teach Data
Get Teach Data that is formatted for Alfred from Vardaan
For this setup the dataset directory is outside of this directory (i.e, ../teach-dataset)
For this setup the dataset directory is outside of this directory (i.e, ../hlsm_formatted_edh_format)
For later: reduce the dataset redundancy between ../teach-dataset and ../hlsm_formatted_edh_format
source init.sh
source env_setup.sh
source venv/teach-hlsm/bin/activate
bash hlsm_teach_without_docker.sh
The evaluation metrics will be in $HOST_OUTPUT_DIR/$SUBMISSION_PK/metrics_file
.
Images for each episode will be in $HOST_IMAGES_DIR/$SUBMISSION_PK
.