Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph

This is the source code for our HRL-GRG model and the baseline methods described in the paper.

paper | project webpage


Our code was developed and tested with the following dependencies:

  • python==2.7.15
  • scipy==1.2.0
  • numpy==1.15.4
  • tensorflow==1.6.0
  • TF-Slim
  • opencv==3.2.0-dev
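Because these versions are old and pinned exactly, it can help to compare the active environment against the list before running anything. The following is a minimal sketch, not part of the repository: the helper names find_mismatches and installed_versions are hypothetical, and it assumes each package exposes a __version__ attribute (opencv is imported under the module name cv2).

```python
import importlib

# Pinned versions copied from the dependency list above.
# The keys are import names, so opencv appears as "cv2".
REQUIRED = {
    "scipy": "1.2.0",
    "numpy": "1.15.4",
    "tensorflow": "1.6.0",
    "cv2": "3.2.0-dev",
}


def find_mismatches(required, installed):
    """Return {name: (wanted, found)} for every package whose version differs.

    `installed` maps import names to version strings; a package that is
    absent from `installed` is reported with found=None.
    """
    mismatches = {}
    for name, wanted in required.items():
        found = installed.get(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def installed_versions(names):
    """Import each package and read its __version__ (None if unavailable)."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", None)
        except ImportError:
            versions[name] = None
    return versions


# Example use (hypothetical helper, not provided by the repository):
#   for name, (wanted, found) in find_mismatches(
#           REQUIRED, installed_versions(REQUIRED)).items():
#       print("%s: wanted %s, found %s" % (name, wanted, found))
```

The Python interpreter version itself (2.7.15) is not checked here; sys.version_info can be inspected for that.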

Before running the code, please set the path to the code directory in config.json. There is one config.json in each of the grid_world, robotic_object_search/House3D and robotic_object_search/AI2-THOR directories.
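Editing three config.json files by hand is easy to get wrong, so the update could be scripted. The sketch below is only an illustration: the key name "code_dir" is a placeholder, since the actual key used in the repository's config.json is not stated here, and set_code_dir is a hypothetical helper. Check the real file and pass the correct key.

```python
import json

# The three locations named above, relative to the repository root.
CONFIG_FILES = [
    "grid_world/config.json",
    "robotic_object_search/House3D/config.json",
    "robotic_object_search/AI2-THOR/config.json",
]


def set_code_dir(config_path, code_dir, key="code_dir"):
    """Store `code_dir` under `key` in the JSON file at `config_path`.

    NOTE: "code_dir" is a placeholder key name (an assumption); open the
    repository's config.json to find the key it actually uses.
    All other entries in the file are preserved.
    """
    with open(config_path) as f:
        config = json.load(f)
    config[key] = code_dir
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
    return config


# Example use from the repository root (the key name is hypothetical):
#   for path in CONFIG_FILES:
#       set_code_dir(path, "/absolute/path/to/the/code", key="code_dir")
```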

Before running the robotic object search code on AI2-THOR, please download our pre-processed data sourced from AI2-THOR and extract it into the robotic_object_search/AI2-THOR directory.

Before running the robotic object search code on House3D, please download our pre-processed data sourced from House3D and extract it into the robotic_object_search/House3D directory.

Download our pre-trained models and put them in the corresponding code directories for training and/or evaluating our method.


Grid-world domain

To train our HRL-GRG model on the grid-world domain, run this command:

# From grid_world/HRL-GRG/

To train other baseline methods mentioned in the paper, run the same command from the corresponding directories.

Robotic Object Search


To train our HRL-GRG model on AI2-THOR, run this command:

# Specify the parameters in robotic_object_search/AI2-THOR/HRL-GRG/train.sh, 
# and from robotic_object_search/AI2-THOR/HRL-GRG/


# From robotic_object_search/AI2-THOR/HRL-GRG/
python train.py \
    --pretrained_model_path=${PATH_TO_PRETRAINED_MODEL} \

where the pretrained_model_path is ../A3C/result_pretrain/model.


To train our HRL-GRG model on House3D, run this command:

# Specify the parameters in robotic_object_search/House3D/HRL-GRG/train.sh, 
# and from robotic_object_search/House3D/HRL-GRG/


# From robotic_object_search/House3D/HRL-GRG/
python train.py \
    --default_scenes=<environments_to_train> \
    --default_targets=<target_objects_to_train> \
    --pretrained_model_path=${PATH_TO_PRETRAINED_MODEL} \

where the pretrained_model_path is ../A3C/result_se_for_pretrain/model for the single environment setting, and ../A3C/result_me_for_pretrain/model for the multiple environments setting.
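The choice of pre-trained model depends on the setting, which is easy to mix up when launching runs by hand. As a rough sketch, the House3D training command could be assembled like this. The three flags and both model paths come from the text above; the helper name house3d_train_command and the "single"/"multiple" labels are hypothetical, and the original command may take further flags that are not shown here.

```python
# Pre-trained A3C model for each setting, as stated above.
PRETRAINED_MODEL = {
    "single": "../A3C/result_se_for_pretrain/model",
    "multiple": "../A3C/result_me_for_pretrain/model",
}


def house3d_train_command(setting, scenes, targets):
    """Build the argument list for train.py in House3D/HRL-GRG.

    `scenes` and `targets` are passed through unchanged; their expected
    format is defined by train.py and is not assumed here.
    """
    if setting not in PRETRAINED_MODEL:
        raise ValueError("setting must be 'single' or 'multiple'")
    return [
        "python", "train.py",
        "--default_scenes={}".format(scenes),
        "--default_targets={}".format(targets),
        "--pretrained_model_path={}".format(PRETRAINED_MODEL[setting]),
    ]


# Example use, from robotic_object_search/House3D/HRL-GRG/:
#   import subprocess
#   subprocess.check_call(house3d_train_command("single", scenes, targets))
```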

To train other baseline methods mentioned in the paper, run the same command from the corresponding directories.

Evaluation and Results

Grid-world domain

The following results are for our HRL-GRG method on the unseen grid-world maps with seen goals:

Unseen Envs Seen Goals | SR | AS / MS | SPL
HRL-GRG | 0.57 | 28.71 / 9.03 | 0.33

To evaluate the method and reproduce these results, run this command:

# From grid_world/HRL-GRG/
CUDA_VISIBLE_DEVICES=-1 python evaluate.py \

To reproduce the results for the unseen goals and the overall goals, specify the evaluate_file as '../random_method/maps_16X16_v6_valid_unseengoals.txt' and '../random_method/maps_16X16_v6_valid_total.txt' respectively.

To evaluate other baseline methods, run the same command from the corresponding directories.
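For readers unfamiliar with the SR and SPL columns in the results tables, the sketch below shows how these two metrics are commonly computed in navigation work. It assumes SR is the success rate and SPL is Success weighted by Path Length in its usual form, the mean over episodes of S_i * l_i / max(p_i, l_i); the paper's exact definitions may differ, and this is not the repository's evaluation code. The AS / MS column is not covered.

```python
def success_rate(successes):
    """Fraction of episodes that reached the goal."""
    return sum(1.0 for s in successes if s) / len(successes)


def spl(successes, shortest_lengths, taken_lengths):
    """Success weighted by Path Length, in its commonly used form.

    Each successful episode contributes l / max(p, l), where l is the
    shortest path length and p is the length of the path actually taken.
    Failed episodes contribute 0.
    """
    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, taken_lengths):
        if not success:
            continue
        longest = max(taken, shortest)
        # An episode that starts at the goal (both lengths 0) counts as optimal.
        total += 1.0 if longest == 0 else shortest / float(longest)
    return total / len(successes)


# Toy episodes (made-up numbers for illustration only):
episodes_success = [True, False, True]
episodes_shortest = [10, 5, 4]
episodes_taken = [20, 7, 4]
print(success_rate(episodes_success))
print(spl(episodes_success, episodes_shortest, episodes_taken))
```

With these toy episodes, the first success is half as efficient as the shortest path, the second is optimal, and the failure scores zero, so SPL averages to 0.5.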

Robotic Object Search


The following results are for our HRL-GRG method on the robotic object search task on AI2-THOR, in seen environments with seen goals:

Seen Env Seen Goals | SR | SPL
HRL-GRG | 0.74 | 0.34

To evaluate the method and reproduce these results, run this command:

# From robotic_object_search/AI2-THOR/HRL-GRG/
CUDA_VISIBLE_DEVICES=-1 python evaluate.py \
  --model_path="result_pretrain/model" \

To reproduce the results for the seen environments unseen goals, unseen environments seen goals and unseen environments unseen goals, specify the evaluate_file as '../random_method/ssuo.txt', '../random_method/usso.txt' and '../random_method/usuo.txt' respectively.
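To run the three remaining settings in one go, the mapping from setting to evaluate_file can be written down once. This is only a sketch: the file paths and the --model_path value are taken from the text above, but the flag spelling --evaluate_file is an assumption (the text only names the parameter evaluate_file), and evaluate_command is a hypothetical helper.

```python
# (environments, goals) -> evaluate_file, as listed above.
EVALUATE_FILES = {
    ("seen", "unseen"): "../random_method/ssuo.txt",
    ("unseen", "seen"): "../random_method/usso.txt",
    ("unseen", "unseen"): "../random_method/usuo.txt",
}


def evaluate_command(envs, goals, model_path="result_pretrain/model"):
    """Build the evaluate.py argument list for one AI2-THOR setting.

    ASSUMPTION: the parameter is passed as --evaluate_file=<path>;
    check evaluate.py for the actual flag name.
    """
    return [
        "python", "evaluate.py",
        "--model_path={}".format(model_path),
        "--evaluate_file={}".format(EVALUATE_FILES[(envs, goals)]),
    ]


# Example use, from robotic_object_search/AI2-THOR/HRL-GRG/:
#   import os, subprocess
#   env = dict(os.environ, CUDA_VISIBLE_DEVICES="-1")
#   for envs, goals in EVALUATE_FILES:
#       subprocess.check_call(evaluate_command(envs, goals), env=env)
```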

To evaluate the Random method on the same settings, run the following command, specifying the matching evaluate_file for each setting.

# From robotic_object_search/AI2-THOR/random_method/
CUDA_VISIBLE_DEVICES=-1 python random_walk.py \


To evaluate our method HRL-GRG for the robotic object search task on House3D,

  • run the command,
# From robotic_object_search/House3D/HRL-GRG/
CUDA_VISIBLE_DEVICES=-1 python evaluate.py \
  --model_path="result_se_pretrain/model" \

to reproduce the results of our method in the single-environment setting for seen goals:

Single Env Seen Goals | SR | SPL
HRL-GRG | 0.88 | 0.33

  • run the command,
# From robotic_object_search/House3D/HRL-GRG/
CUDA_VISIBLE_DEVICES=-1 python evaluate.py \
  --model_path="result_se_pretrain/model" \

to reproduce the results of our method in the single-environment setting for unseen goals:

Single Env Unseen Goals | SR | SPL
HRL-GRG | 0.79 | 0.21

  • run the command,
# From robotic_object_search/House3D/HRL-GRG/
CUDA_VISIBLE_DEVICES=-1 python evaluate.py \
  --model_path="result_me_pretrain/model" \

to reproduce the results of our method in the multiple-environments setting for seen environments:

Multiple Envs Seen Envs | SR | SPL
HRL-GRG | 0.76 | 0.20

  • run the command,
# From robotic_object_search/House3D/HRL-GRG/
CUDA_VISIBLE_DEVICES=-1 python evaluate.py \
  --model_path="result_me_pretrain/model" \

to reproduce the results of our method in the multiple-environments setting for unseen environments:

Multiple Envs Unseen Envs | SR | SPL
HRL-GRG | 0.62 | 0.10

To evaluate other baseline methods, run the same command from the corresponding directories.
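When checking a reproduction, it may be convenient to compare the numbers printed by evaluate.py against the House3D results reported above. The sketch below hard-codes those reported SR and SPL values. The helper name matches_reported and the tolerance of 0.02 are assumptions chosen for illustration, not a threshold stated by the authors.

```python
# HRL-GRG results on House3D, copied from the tables above: (SR, SPL).
REPORTED = {
    ("single", "seen goals"): (0.88, 0.33),
    ("single", "unseen goals"): (0.79, 0.21),
    ("multiple", "seen envs"): (0.76, 0.20),
    ("multiple", "unseen envs"): (0.62, 0.10),
}


def matches_reported(setting, measured_sr, measured_spl, tol=0.02):
    """Return True if both metrics are within `tol` of the reported values.

    `tol` is an arbitrary illustrative tolerance (an assumption), since
    evaluation results can vary slightly between runs.
    """
    reported_sr, reported_spl = REPORTED[setting]
    return (abs(measured_sr - reported_sr) <= tol
            and abs(measured_spl - reported_spl) <= tol)


# Example use with values read off an evaluation run:
#   matches_reported(("single", "seen goals"), measured_sr, measured_spl)
```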