This is the PyTorch implementation of our paper:
Curriculum Learning for Vision-and-Language Navigation [arxiv]
Jiwen Zhang, Zhongyu Wei, Jianqing Fan, Jiajie Peng
35th Conference on Neural Information Processing Systems (NeurIPS 2021)
- 2021-12-27: We uploaded `tasks/R2R-judy/main.py` and the training instructions.
- 2021-11-16: Our paper is now on arXiv; you can access it by clicking here!
- 2021-11-14: We updated the package of agents and methods (`tasks/R2R-judy/src`).
- 2021-11-08: We updated the installation instructions.
- 2021-11-06: We uploaded the CLR2R dataset mentioned in our paper (`tasks/R2R-judy/data`).
This repository includes several previously released SOTA navigation agents:
- Follower agent (from University of California, Berkeley, Carnegie Mellon University and Boston University) released with paper Speaker-Follower Models for Vision-and-Language Navigation, by Fried, Daniel, Ronghang Hu, Volkan Cirik, Anna Rohrbach, Jacob Andreas, Louis-Philippe Morency, Taylor Berg-Kirkpatrick, Kate Saenko, Dan Klein and Trevor Darrell. NeurIPS(2018).
- Self-Monitoring agent (from Georgia Institute of Technology, University of Maryland and Salesforce Research) released with paper Self-Monitoring Navigation Agent via Auxiliary Progress Estimation, by Ma, Chih-Yao, Jiasen Lu, Zuxuan Wu, Ghassan Al-Regib, Zsolt Kira, Richard Socher and Caiming Xiong. ICLR(2019).
- EnvDrop agent (from UNC Chapel Hill) released with paper Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout, by Tan, Hao, Licheng Yu and Mohit Bansal. NAACL(2019).
and a path-instruction scorer:
- VLN-BERT (from Georgia Institute of Technology, Facebook AI Research and Oregon State University) released with paper Improving Vision-and-Language Navigation with Image-Text Pairs from the Web, by Majumdar, Arjun, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh and Dhruv Batra. ECCV(2020).
- Install Python 3.6 (Anaconda recommended: https://docs.anaconda.com/anaconda/install/index.html).
- Install PyTorch following the instructions on https://pytorch.org/ (in our experiments we used PyTorch 1.5.1+cu101).
- Follow the build instructions in this GitHub repository to build the v0.1 Matterport3D simulator. In case you run into an error when compiling the simulator, you can try the following (a quick Python import check is given after these steps):

```bash
mkdir build && cd build
cmake -D CUDA_TOOLKIT_ROOT_DIR=path/to/your/cuda ..
make
cd ../
```

For more details on the Matterport3D Simulator, you can refer to `README_Matterport3DSimulator.md`.
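After the build, you can sanity-check the simulator from Python. The snippet below is a minimal sketch, assuming the compiled MatterSim module sits in `build/` and following the typical v0.1 setup used by the R2R agents (rendering disabled, so no scan imagery is needed); the scan/viewpoint IDs are left as placeholders.

```python
import sys
import math

sys.path.append('build')  # assumes the compiled MatterSim module is in ./build
import MatterSim

sim = MatterSim.Simulator()
sim.setRenderingEnabled(False)         # skip rendering; no scan imagery required
sim.setDiscretizedViewingAngles(True)
sim.setCameraResolution(640, 480)
sim.setCameraVFOV(math.radians(60))
sim.init()

# Start an episode once you have a valid scanId / viewpointId from connectivity/:
# sim.newEpisode('<scanId>', '<viewpointId>', 0, 0)
print("MatterSim imported and initialized.")
```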
Luckily, this repository already contains the R2R and CLR2R datasets, so you ONLY have to download the precomputed ResNet image features from Matterport3DSimulator. Download and extract the tsv files into the `img_features` directory. You will only need the ImageNet features to replicate our results.
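If you want to inspect the downloaded features, the sketch below shows one way to read them, assuming the usual R2R TSV layout (base64-encoded ResNet-152 features, 36 views x 2048 dims per viewpoint); the file name is a placeholder for whichever ImageNet-feature tsv you downloaded.

```python
import base64
import csv
import sys

import numpy as np

FEATURE_FILE = 'img_features/ResNet-152-imagenet.tsv'   # placeholder file name
TSV_FIELDNAMES = ['scanId', 'viewpointId', 'image_w', 'image_h', 'vfov', 'features']
VIEWS, FEATURE_DIM = 36, 2048

csv.field_size_limit(sys.maxsize)  # feature fields are very long
features = {}
with open(FEATURE_FILE, 'rt') as f:
    reader = csv.DictReader(f, delimiter='\t', fieldnames=TSV_FIELDNAMES)
    for row in reader:
        key = row['scanId'] + '_' + row['viewpointId']
        # Each row stores a base64-encoded float32 array of shape (36, 2048).
        features[key] = np.frombuffer(
            base64.b64decode(row['features']), dtype=np.float32
        ).reshape(VIEWS, FEATURE_DIM)
print(f"Loaded features for {len(features)} viewpoints")
```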
Clone (or just download) this repository and replace the `tasks` directory in the original Matterport3D simulator with the one in this repository. After following the steps above, your file directory should look like this:
```
Matterport3D/
    build/              # should be compiled on your machine
    cmake/
    connectivity/       # stores JSON connectivity graphs for each scan
    img_features/       # stores precomputed image features, i.e. ResNet-152 features
    include/
    pybind11/           # a dependency of the Matterport3D Simulator
    ...
    tasks/R2R-judy/     # replace it with the one in this repository
    ...
```
To replicate Table 3 in our paper, run the following command in your shell:

```bash
CONFIG_PATH="path-to-config-file"
CL_MODE=""      # "" / "NAIVE" / "SELF-PACE"
python tasks/R2R-judy/main.py \
    --config-file $CONFIG_PATH \
    TRAIN.DEVICE your_device_id \
    TRAIN.CLMODE $CL_MODE \
    ...
```
You can refer to `tasks/R2R-judy/runner` for more details.