
Dataset for the hierarchical object manipulation system

HOMS-dataset

Robot manipulation task data for offline reinforcement learning (offline RL), intended in particular for the hierarchical object manipulation system (HOMS).

HOMS is a system that carries out multiple tasks procedurally, one after another.

Source code: https://github.com/SunInKim/Hierarchical-Obejct-Manipulation-System-HOMS

The dataset has two parts: one for the high-level policy (task classifier) and one for the low-level policy (robot controller).


[Figure: re_Task_list]


Data: https://koreaoffice-my.sharepoint.com/:f:/g/personal/jhj0630_korea_edu/EnGP9hFKsbVKmU1s_lQlDIAB2zMaeqnFRuXicRc84Mdcnw?e=klpUEj

Collection method

The dataset was collected in the PyBullet simulator using a scripted policy.

The scripted policy can be found in env.util.task_policy.py.

To collect the dataset yourself, run:

python get_rollout.py

Description of the dataset

Each rollout contains 100 episodes of a single unit task.

High-level policy (task classifier)

Structure

rollout
--episode
  --observations
  --actions
  --rewards
  --next_observations
  --terminals
  --possible_actions
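
The nesting above can be walked with a few lines of Python. This is a minimal sketch, not the loader used by HOMS: it assumes a rollout is already in memory as a list of episodes, each a dict keyed by the field names above, with one entry per time step. The on-disk file format is not described here, and `count_transitions`, `make_dummy_episode`, and the zero-filled placeholder data are hypothetical names introduced only for illustration.

```python
import numpy as np

# Field names taken from the high-level structure listed above.
HIGH_LEVEL_KEYS = (
    "observations", "actions", "rewards",
    "next_observations", "terminals", "possible_actions",
)


def count_transitions(rollout, keys=HIGH_LEVEL_KEYS):
    """Count the time steps in a rollout, checking every episode has all fields.

    Assumption: `rollout` is a list of episodes and each episode is a dict
    mapping a field name to a sequence with one entry per time step.
    """
    total = 0
    for index, episode in enumerate(rollout):
        missing = [key for key in keys if key not in episode]
        if missing:
            raise KeyError(f"episode {index} is missing fields: {missing}")
        total += len(episode["actions"])
    return total


def make_dummy_episode(steps):
    """Build a placeholder episode (zeros only) using the shapes given in this section."""
    return {
        "observations": np.zeros((steps, 6, 240, 240), dtype=np.uint8),
        "actions": np.zeros((steps, 11), dtype=np.float32),
        "rewards": np.zeros((steps, 1), dtype=np.float32),
        "next_observations": np.zeros((steps, 6, 240, 240), dtype=np.uint8),
        "terminals": np.zeros((steps, 1), dtype=np.float32),
        # N varies per step, so this field is kept as a list of 1-D arrays.
        "possible_actions": [np.arange(2) for _ in range(steps)],
    }


rollout = [make_dummy_episode(3), make_dummy_episode(2)]
print(count_transitions(rollout))
```

Passing a different `keys` tuple lets the same check run on the low-level data, whose episodes carry a `tasks` field instead of `possible_actions`.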

Data shape

observations: # goal_image + current_image (6, 240, 240)
actions: # num_task (11,)
rewards: # 1 when the possible action is None (1,)
next_observations: # goal_image + current_image (6, 240, 240)
terminals: # 1 when the possible action is None (1,)
possible_actions: # N is num_possible_action (N,)
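
As a rough illustration of these shapes, the sketch below splits a stacked observation back into its two images and checks one transition against the listed dimensions. It rests on two assumptions that are not confirmed here: that the goal image occupies the first three channels and the current image the last three, and that `possible_actions` may be stored as `None` when no action remains. `split_observation` and `check_high_level_transition` are hypothetical helper names.

```python
import numpy as np

NUM_TASKS = 11                 # length of the action vector listed above
STACKED_SHAPE = (6, 240, 240)  # goal_image + current_image


def split_observation(obs):
    """Split a (6, 240, 240) observation into (goal_image, current_image).

    Assumption: channels 0-2 hold the goal image, channels 3-5 the current one.
    """
    obs = np.asarray(obs)
    if obs.shape != STACKED_SHAPE:
        raise ValueError(f"expected shape {STACKED_SHAPE}, got {obs.shape}")
    return obs[:3], obs[3:]


def check_high_level_transition(transition):
    """Raise ValueError if a single transition does not match the listed shapes."""
    split_observation(transition["observations"])
    split_observation(transition["next_observations"])
    expected = {"actions": (NUM_TASKS,), "rewards": (1,), "terminals": (1,)}
    for key, shape in expected.items():
        actual = np.shape(transition[key])
        if actual != shape:
            raise ValueError(f"{key}: expected shape {shape}, got {actual}")
    possible = transition["possible_actions"]
    # Assumption: None marks "no possible action"; otherwise a 1-D array (N,).
    if possible is not None and np.ndim(possible) != 1:
        raise ValueError("possible_actions must be None or one-dimensional")
    return True


# Placeholder transition filled with zeros, for illustration only.
example = {
    "observations": np.zeros(STACKED_SHAPE, dtype=np.uint8),
    "actions": np.zeros(NUM_TASKS, dtype=np.float32),
    "rewards": np.zeros(1, dtype=np.float32),
    "next_observations": np.zeros(STACKED_SHAPE, dtype=np.uint8),
    "terminals": np.zeros(1, dtype=np.float32),
    "possible_actions": np.array([0, 4, 7]),
}
goal_image, current_image = split_observation(example["observations"])
print(goal_image.shape, current_image.shape, check_high_level_transition(example))
```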

Low-level policy (robot controller)

Structure

rollout
--episode
  --observations 
    --images
    --robot_state
  --actions 
  --rewards 
  --next_observations 
  --terminals 
  --tasks 

Data shape

observations: # current_image (3, 240, 240), robot_state (x, y, z, yaw, gripper_state) (6,)
actions: # robot_action (x, y, z, yaw, gripper, task_terminal) (6,)
rewards: # 1 for completing the task and returning to the initial position (1,)
next_observations: # current_image (3, 240, 240), robot_state (6,), task_id (10,)
terminals: # done when the task is completed and the robot has returned to the initial position (1,)
tasks: # one-hot vector indicating the type of task (10,)
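
To make the low-level fields concrete, here is a small sketch that decodes the one-hot `tasks` vector into an integer index and separates the motion part of an action from its `task_terminal` flag. The component order follows the listing above; the helper names `task_index` and `split_action` are hypothetical, and nothing is assumed about whether the motion values are absolute poses or offsets.

```python
import numpy as np

NUM_LOW_LEVEL_TASKS = 10  # length of the one-hot task vector listed above


def task_index(task_one_hot):
    """Return the position of the single 1 in a (10,) one-hot task vector."""
    vector = np.asarray(task_one_hot)
    if vector.shape != (NUM_LOW_LEVEL_TASKS,):
        raise ValueError(f"expected shape (10,), got {vector.shape}")
    if np.count_nonzero(vector) != 1 or vector.sum() != 1:
        raise ValueError("task vector is not one-hot")
    return int(np.argmax(vector))


def split_action(action):
    """Split a (6,) action into (x, y, z, yaw, gripper) and task_terminal."""
    action = np.asarray(action)
    if action.shape != (6,):
        raise ValueError(f"expected shape (6,), got {action.shape}")
    return action[:5], action[5]


# Placeholder values, for illustration only.
one_hot = np.zeros(NUM_LOW_LEVEL_TASKS)
one_hot[3] = 1
motion, task_terminal = split_action(np.array([0.1, 0.2, 0.3, 0.0, 1.0, 1.0]))
print(task_index(one_hot), motion.shape, task_terminal)
```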

Examples of transitions of each module to reach the goal state

Goal state

[Figure: re_goal_state]

Transition for task classifier

[Figure: re_task_select]

Transition for robot controller

[Figure: re_robot_control]