Robot manipulation task dataset for offline reinforcement learning (offline RL), in particular for the Hierarchical Object Manipulation System (HOMS).
HOMS is a system for performing multiple manipulation tasks in sequence.
Source code: https://github.com/SunInKim/Hierarchical-Obejct-Manipulation-System-HOMS
The dataset consists of a dataset for the high-level policy (task classifier) and a dataset for the low-level policy (robot controller).
The dataset was collected with the PyBullet simulator.
The scripted policy can be found in env/util/task_policy.py.
The dataset can be collected by executing:

python get_rollout.py
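
For reference, below is a minimal sketch of the kind of collection loop get_rollout.py presumably runs. The environment object, its gym-style reset/step interface, and the scripted_policy call are assumptions for illustration, not the repository's exact API.

# Hypothetical sketch of a scripted-policy collection loop; the env object,
# its gym-style reset/step interface, and scripted_policy are assumptions.
import pickle
import numpy as np

def collect_rollout(env, scripted_policy, num_episodes=100, path="rollout.pkl"):
    episodes = []
    for _ in range(num_episodes):
        ep = {"observations": [], "actions": [], "rewards": [],
              "next_observations": [], "terminals": []}
        obs = env.reset()
        done = False
        while not done:
            action = scripted_policy(obs)                  # scripted expert action
            next_obs, reward, done, info = env.step(action)
            ep["observations"].append(obs)
            ep["actions"].append(action)
            ep["rewards"].append(reward)
            ep["next_observations"].append(next_obs)
            ep["terminals"].append(done)
            obs = next_obs
        episodes.append({k: np.asarray(v) for k, v in ep.items()})
    with open(path, "wb") as f:                            # one rollout = 100 episodes
        pickle.dump(episodes, f)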
Each rollout contains 100 episodes of a unit task.

High-level policy dataset (task classifier):

rollout
--episode
----observations
----actions
----rewards
----next_observations
----terminals
----possible_actions
observations: # goal_image + current_image, shape (6, 240, 240)
actions: # one-hot task vector, shape (11,) where 11 = num_task
rewards: # 1 when possible_actions is None (no action remains), shape (1,)
next_observations: # goal_image + current_image, shape (6, 240, 240)
terminals: # 1 when possible_actions is None, shape (1,)
possible_actions: # shape (N,), where N = num_possible_action
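
A minimal sketch of loading and inspecting one high-level rollout. The pickle format, file name, and per-episode dict layout are assumptions based on the key list above, not a confirmed file specification.

# Hypothetical loader for a high-level rollout; the pickle format and
# per-episode dict layout are assumptions based on the key list above.
import pickle
import numpy as np

with open("rollout_high.pkl", "rb") as f:   # file name is illustrative
    episodes = pickle.load(f)               # list of 100 episode dicts

ep = episodes[0]
obs = np.asarray(ep["observations"])        # (T, 6, 240, 240): goal + current image
act = np.asarray(ep["actions"])             # (T, 11): one-hot task vector
rew = np.asarray(ep["rewards"])             # (T, 1): 1 once no action remains
done = np.asarray(ep["terminals"])          # (T, 1)
print(obs.shape, act.shape, rew.shape, done.shape)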
Low-level policy dataset (robot controller):

rollout
--episode
----observations
------images
------robot_state
----actions
----rewards
----next_observations
----terminals
----tasks
observations: # current_image (3, 240, 240) and robot_state (x, y, z, yaw, gripper_state) (6,)
actions: # robot action (x, y, z, yaw, gripper, task_terminal), shape (6,)
rewards: # 1 when the task is completed and the robot has returned to its initial position, shape (1,)
next_observations: # current_image (3, 240, 240), robot_state (6,), task_id (10,)
terminals: # 1 (done) when the task is completed and the robot has returned to its initial position, shape (1,)
tasks: # one-hot vector indicating the task type, shape (10,)
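
To feed these episodes into a standard offline RL replay buffer, the per-episode arrays can be flattened into individual transitions. Below is a sketch, again assuming the pickle layout and the nested observations dict described above; the file name is illustrative.

# Hypothetical flattening of low-level episodes into offline RL transitions;
# assumes the pickle layout and nested observations dict described above.
import pickle
import numpy as np

with open("rollout_low.pkl", "rb") as f:    # file name is illustrative
    episodes = pickle.load(f)

transitions = {"images": [], "robot_state": [], "actions": [],
               "rewards": [], "terminals": [], "tasks": []}
for ep in episodes:
    obs = ep["observations"]                # nested dict: images + robot_state
    transitions["images"].extend(obs["images"])
    transitions["robot_state"].extend(obs["robot_state"])
    for key in ("actions", "rewards", "terminals", "tasks"):
        transitions[key].extend(ep[key])
batch = {k: np.asarray(v) for k, v in transitions.items()}
print(batch["images"].shape)                # (num_transitions, 3, 240, 240)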