ParaGon: Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement
Project Page: 1989ryan.github.io/projects/paragon.html
This repository contains the PyTorch implementation of the paper: Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement.
We highly recommend using Docker to run the code.
Install nvidia-docker
Build the Docker container
python3 scripts/docker_build.py
Run the Docker container
python3 scripts/docker_run.py
You will need 269 GB of free disk space to download the full dataset.
python3 scripts/get_dataset.py
If you do not have enough space, you can modify scripts/get_dataset.py to download only the testing data (44 GB).
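Before starting the download, it can help to check how much disk space is actually free. The snippet below is a minimal sketch and is not part of this repository: the helper names `free_space_gb` and `choose_download` are hypothetical, and the sizes quoted above are assumed to be decimal gigabytes.

```python
import shutil

# Sizes quoted in this README, interpreted as decimal gigabytes (an assumption).
FULL_DATASET_GB = 269
TEST_ONLY_GB = 44


def choose_download(free_gb: float) -> str:
    """Return which download fits in `free_gb` gigabytes of free space."""
    if free_gb >= FULL_DATASET_GB:
        return "full"
    if free_gb >= TEST_ONLY_GB:
        return "test-only"
    return "none"


def free_space_gb(path: str = ".") -> float:
    """Free space on the filesystem containing `path`, in decimal gigabytes."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    free = free_space_gb(".")
    print(f"{free:.1f} GB free -> {choose_download(free)}")
```

Run it from the directory where the data will be stored. A result of "test-only" suggests editing scripts/get_dataset.py as described above before downloading.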
python3 scripts/pretrain_model.py
bash scripts/run_pretrain.sh
bash scripts/train.sh
bash scripts/eval.sh
If you find this work useful in your research, please cite:
@InProceedings{zhao2023paragon,
author = {Zhao, Zirui and Lee, Wee Sun and Hsu, David},
title = {Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement},
booktitle = {Proceedings of the IEEE International Conference on Robotics and Automation},
year = {2023}
}