Robot manipulation task data for offline reinforcement learning (offline RL), intended in particular for the hierarchical object manipulation system (HOMS).
HOMS is a system for performing multiple tasks procedurally.
source code link:
The dataset consists of two parts: one for the high-level policy (task classifier) and one for the low-level policy (robot controller).
The dataset is collected using the PyBullet simulator.
The scripted policy can be found in
The dataset can be collected by executing the code
Each rollout contains 100 episodes of a unit task.
High-level policy dataset (task classifier):
observations: # goal_image+current_image (6, 240, 240)
actions: # num_task (11,)
rewards: # 1 when the possible action is None (1,)
next_observations: # goal_image+current_image (6, 240, 240)
terminals: # 1 when the possible action is None (1,)
possible_actions: # N is num_possible_action (N,)
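
As a rough illustration of the high-level field layout listed above, the sketch below builds one dummy transition with those shapes and validates it. It assumes that each field is a NumPy array, that the 6 channels are the goal image stacked on top of the current image (goal-first order is an assumption), that `actions` is a one-hot vector, and that `possible_actions` holds integer indices. The names `HIGH_LEVEL_SHAPES`, `stack_goal_and_current`, `check_shapes`, and `make_dummy_high_level_transition` are hypothetical and are not part of the HOMS source code.

```python
import numpy as np

# Per-transition shapes as listed above; None marks the variable length N.
HIGH_LEVEL_SHAPES = {
    "observations": (6, 240, 240),
    "actions": (11,),
    "rewards": (1,),
    "next_observations": (6, 240, 240),
    "terminals": (1,),
    "possible_actions": (None,),
}


def stack_goal_and_current(goal_image, current_image):
    """Stack two (3, 240, 240) images into one (6, 240, 240) observation.

    Goal-first channel order is an assumption, not confirmed by the source.
    """
    return np.concatenate([goal_image, current_image], axis=0)


def check_shapes(transition, expected):
    """Raise ValueError if a field is missing or has an unexpected shape."""
    for name, shape in expected.items():
        if name not in transition:
            raise ValueError(f"missing field: {name}")
        actual = np.asarray(transition[name]).shape
        same_rank = len(actual) == len(shape)
        if not same_rank or any(
            e is not None and a != e for a, e in zip(actual, shape)
        ):
            raise ValueError(f"{name}: expected {shape}, got {actual}")


def make_dummy_high_level_transition(possible=(2, 5)):
    """Build a placeholder transition filled with zeros (illustration only)."""
    goal = np.zeros((3, 240, 240), dtype=np.uint8)
    current = np.zeros((3, 240, 240), dtype=np.uint8)
    obs = stack_goal_and_current(goal, current)
    action = np.zeros(11, dtype=np.float32)
    action[possible[0]] = 1.0  # assumed one-hot encoding of the chosen task
    return {
        "observations": obs,
        "actions": action,
        "rewards": np.zeros(1, dtype=np.float32),
        "next_observations": obs.copy(),
        "terminals": np.zeros(1, dtype=np.float32),
        "possible_actions": np.asarray(possible, dtype=np.int64),
    }


transition = make_dummy_high_level_transition()
check_shapes(transition, HIGH_LEVEL_SHAPES)  # passes silently when shapes match
```

A check of this kind can be run once after loading the data to catch a transposed image axis or a missing field before training.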
Low-level policy dataset (robot controller):
observations: # current_image (3, 240, 240), robot_state (x,y,z,yaw,gripper_state) (6,)
actions: # robot_action (x,y,z,yaw,gripper,task_terminal) (6,)
rewards: # 1 when the task is completed and the robot has returned to the initial position (1,)
next_observations: # current_image (3, 240, 240), robot_state (6,), task_id (10,)
terminals: # done when the task is completed and the robot has returned to the initial position (1,)
tasks: # one-hot vector which indicates the type of task (10,)
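
To make the low-level field layout above more concrete, the following sketch splits a 6-element action into its named components and decodes the one-hot task vector into an index. It assumes NumPy arrays and the component order given in the field list. Whether x, y, z are absolute or relative, and their units, are not stated in the source, so the code makes no claim about that. The names `ACTION_FIELDS`, `unpack_action`, and `task_index` are hypothetical helpers, not part of the HOMS source code.

```python
import numpy as np

# Component order taken from the field list: (x, y, z, yaw, gripper, task_terminal).
ACTION_FIELDS = ("x", "y", "z", "yaw", "gripper", "task_terminal")
NUM_TASKS = 10  # length of the one-hot task vector


def unpack_action(action):
    """Return the 6 action components as a dict keyed by name."""
    action = np.asarray(action, dtype=np.float32)
    if action.shape != (len(ACTION_FIELDS),):
        raise ValueError(f"expected shape (6,), got {action.shape}")
    return dict(zip(ACTION_FIELDS, action.tolist()))


def task_index(task_one_hot):
    """Convert a one-hot task vector of length 10 into an integer index."""
    task = np.asarray(task_one_hot)
    if task.shape != (NUM_TASKS,) or task.sum() != 1 or task.max() != 1:
        raise ValueError("expected a one-hot vector of length 10")
    return int(task.argmax())


# Dummy values for illustration only; they do not come from the dataset.
action = np.array([0.5, -0.25, 0.0, 1.0, 1.0, 0.0])
parts = unpack_action(action)

task = np.zeros(NUM_TASKS)
task[3] = 1
idx = task_index(task)
```

Here `parts["task_terminal"]` gives the last action component and `idx` is the position of the single 1 in the task vector.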