Visual Navigation with Natural Multimodal Assistance (EMNLP 2019)

HANNA: Visual Navigation with Natural Multimodal Assistance

License: MIT

EMNLP'19 Paper: Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning

Authors: Khanh Nguyen, Hal Daumé III

UPDATE Oct 15, 2019: fixed a bug in the validation code that prevented the code from reproducing the results in the paper.

What is HANNA?

HANNA is an interactive, photo-realistic simulator in which an agent fulfills object-finding tasks by leveraging natural language-and-vision assistance.

Figure: An example HANNA task.

How is HANNA different from other visual navigation tasks?

Comparison of HANNA with VLN (Anderson et al., 2018b), EQA (Wijmans et al., 2019), VNLA (Nguyen et al., 2019), and CVDN (Thomason et al., 2019).

Let's play with HANNA!

  1. git clone --recursive https://github.com/khanhptnk/hanna.git (don't forget the --recursive flag, which also fetches the submodules!)
  2. Download the data.
  3. Set up the simulator.
  4. Run experiments.
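
If the repository was cloned without the recursive flag, the submodules can usually be fetched afterwards instead of cloning again. The following is a minimal sketch, assuming `git` is installed and the clone lives in a directory named `hanna` (the default for the clone command in step 1). The helper name `ensure_submodules` is hypothetical and not part of this repo; only the `git submodule update --init --recursive` call is a standard git command.

```shell
# Hypothetical helper (not part of the HANNA repo): fetch any submodules
# that are missing from an existing clone.
ensure_submodules() {
    repo_dir="${1:-hanna}"   # assumed clone directory; defaults to "hanna"

    # Refuse to run outside a git checkout.
    if [ ! -e "$repo_dir/.git" ]; then
        echo "not a git checkout: $repo_dir" >&2
        return 1
    fi

    # Standard git command: initialize and update all nested submodules.
    git -C "$repo_dir" submodule update --init --recursive
}

# Usage (run from the directory that contains the clone):
#   ensure_submodules hanna
```

The function only reports an error and returns a non-zero status when the given directory is not a git checkout; it does not download data or set up the simulator, which remain steps 2 and 3.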

Citation

If you use the code or data in this repo, please cite our paper using the following BibTeX entry:

@inproceedings{nguyen2019hanna,
  author = {Nguyen, Khanh and Daum{\'e} III, Hal},
  title = {Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning},
  booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  month = {November},
  year = {2019},
}