Tactile-Gym: RL suite for tactile robotics

Project Website • Arxiv

This repo provides a suite of PyBullet reinforcement learning environments targeted towards using tactile data as the main form of observation.

Installation
Testing Environments
Training Agents
Pretrained Agents
Environment Details
Observation Details
Alternate Robot Arms
Additional Info

Installation

This repo has only been developed and tested with Ubuntu 18.04 and Python 3.8.

# TODO: install via pypi
git clone https://github.com/ac-93/tactile_gym.git
cd tactile_gym
python setup.py install

Testing Environments

Demonstration files are provided for all environments in the examples directory. For example, from the base directory run

python examples/demo_example_env.py

to run a user-controllable example environment.

Training Agents

The environments use the OpenAI Gym interface, so they should be compatible with most reinforcement learning libraries.
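Because the environments follow the Gym interface, a generic interaction loop should work with them. The sketch below is an illustration only: `run_episode`, `StubEnv` and `random_policy` are hypothetical names that are not part of tactile_gym, and the loop assumes the classic Gym API in which `reset()` returns an observation and `step()` returns `(obs, reward, done, info)`. The stub stands in for a real environment so that the loop can be run without PyBullet.

```python
import random


def run_episode(env, policy, max_steps=1000):
    """Roll out one episode with the classic Gym API; return (total_reward, steps)."""
    obs = env.reset()
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = policy(obs)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
        if done:
            break
    return total_reward, steps


class StubEnv:
    """Hypothetical stand-in exposing the same reset/step methods as a Gym env."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self._t = 0

    def reset(self):
        self._t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self._t += 1
        done = self._t >= self.episode_length
        # Dummy observation, constant reward of 1.0, done flag, empty info dict.
        return float(self._t), 1.0, done, {}


def random_policy(obs):
    """Ignore the observation and return a random scalar action."""
    return random.uniform(-1.0, 1.0)


total_reward, steps = run_episode(StubEnv(), random_policy)
```

With the package installed, `StubEnv()` would be replaced by one of the environments from this repo, and `random_policy` by a trained agent's prediction function.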

We use stable-baselines3 for all training. Helper scripts are provided in tactile_gym/sb3_helpers/.

A simple experiment can be run with simple_sb3_example.py, and a full training script can be run with train_agent.py. Experiment hyperparameters are in the parameters directory.

Training with image augmentations: If you intend to use image augmentations for training, as done in the paper, this fork of sb3 contrib is required. (TODO: Contribute this to sb3_contrib).

Pretrained Agents

Example PPO/RAD_PPO agents, trained via SB3, are provided for all environments and all observation spaces. These can be downloaded here and placed in tactile_gym/examples/enjoy.

To demonstrate a pretrained agent, from the base directory run

python examples/demo_trained_agent.py -env='env_name' -obs='obs_type' -algo='algo_name'

Environment Details

Environment Name	Description
`edge_follow-v0`	A flat edge is randomly orientated through 360 degrees and placed within the environment. The sensor is initialised in contact with the start of the edge at a random level of penetration. The objective is to traverse the edge to a goal at the opposing end whilst keeping the edge located centrally on the sensor.
`surface_follow-v0`	A terrain-like surface is generated through OpenSimplex Noise. The sensor is initialised in the center, touching the surface. A goal is randomly placed towards the edges of the surface. The objective is to maintain a normal orientation to the surface and a set penetration distance whilst the sensor is automatically moved towards the goal.
`surface_follow-v1`	Same as `-v0`, except that the goal location is included in the observation and the agent must additionally learn to traverse towards the goal.
`object_roll-v0`	A small spherical object of random size is placed on the table. A flat tactile sensor is initialised to touch the object at a random location relative to the sensor. A goal location is generated in the sensor frame. The objective is to manipulate the object to the goal location.
`object_push-v0`	A cube object is placed on the table and the sensor is initialised to touch the object (in a right-angle configuration). A trajectory of points is generated through OpenSimplex Noise. The objective is to push the object along the trajectory. When the current target point has been reached, the target is incremented along the trajectory until no points are left.
`object_balance-v0`	Similar to a 2D CartPole environment. An unstable pole object is balanced on the tip of a sensor pointing upwards. A random force perturbation is applied to the object to cause instability. The objective is to learn planar actions to counteract the rotation of the object and maintain its balanced position.
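The target-incrementing behaviour described for `object_push-v0` can be sketched as below. This is not the repo's implementation: the function name `advance_target`, the 2D points and the distance threshold are all assumptions made for illustration.

```python
import math


def advance_target(object_xy, trajectory, target_idx, reach_threshold=0.01):
    """Return (new_target_idx, finished) after checking the current target.

    The target index is incremented when the object lies within
    `reach_threshold` of the current target point. `finished` is True
    once the final point has been reached and no points are left.
    """
    if target_idx >= len(trajectory):
        return target_idx, True
    tx, ty = trajectory[target_idx]
    dist = math.hypot(object_xy[0] - tx, object_xy[1] - ty)
    if dist <= reach_threshold:
        target_idx += 1
    return target_idx, target_idx >= len(trajectory)


# Hypothetical straight-line trajectory of three points (metres).
trajectory = [(0.1, 0.0), (0.2, 0.0), (0.3, 0.0)]

idx_far, done_far = advance_target((0.0, 0.0), trajectory, 0)      # far from target 0
idx_near, done_near = advance_target((0.1, 0.005), trajectory, 0)  # within threshold
idx_last, done_last = advance_target((0.3, 0.0), trajectory, 2)    # final point reached
```

An environment step would call such a check once per step and end the episode when `finished` is returned as True.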

Observation Details

All environments contain 4 main modes of observation:

Observation Type	Description
`oracle`	Comprises ideal state information from the simulator. This information is difficult to collect in the real world, so we use it to give baseline performance for a task. The information in this state varies between environments but commonly includes TCP pose, TCP velocity, goal locations and the current state of the environment. This observation requires significantly less compute both to generate data and to train agent networks.
`tactile`	Comprises images (default 128x128) retrieved from the simulated optical tactile sensor attached to the end effector of the robot arm (Env Figures right). Where tactile information alone is not sufficient to solve a task, this observation can be extended with oracle information retrieved from the simulator. This should only include information that could be easily and accurately captured in the real world, such as the TCP pose that is available on industrial robotic arms and the goal pose.
`visual`	Comprises RGB images (default 128x128) retrieved from a static, simulated camera viewing the environment (Env Figures left). Currently, only a single camera is used, although this could be extended to multiple cameras.
`visuotactile`	Combines the RGB visual and tactile image observations into a 4-channel RGBT image. This case demonstrates a simple method of multi-modal sensing.
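The `visuotactile` combination can be illustrated with NumPy. This is a sketch under stated assumptions rather than the repo's code: it assumes a 128x128x3 RGB image and a single-channel 128x128 tactile image of the same dtype, stacked along the channel axis. `make_rgbt` is a hypothetical helper name.

```python
import numpy as np


def make_rgbt(rgb, tactile):
    """Stack an HxWx3 RGB image and an HxW tactile image into HxWx4 RGBT."""
    if rgb.shape[:2] != tactile.shape[:2]:
        raise ValueError("visual and tactile images must share height and width")
    if tactile.ndim == 2:
        tactile = tactile[..., np.newaxis]  # add a trailing channel axis
    return np.concatenate([rgb, tactile], axis=-1)


# Dummy images standing in for the simulated camera and tactile sensor.
rgb = np.zeros((128, 128, 3), dtype=np.uint8)
tactile = np.full((128, 128), 255, dtype=np.uint8)
rgbt = make_rgbt(rgb, tactile)
```

The first three channels of `rgbt` hold the visual image and the fourth holds the tactile image.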

When additional information is required to solve a task, such as goal locations, appending _and_feature to the observation name will return the complete observation.
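As an illustration of the naming convention above, the helper below builds an observation name from a base mode. `observation_name` is a hypothetical helper, not a function provided by this repo. Only the four mode names and the `_and_feature` suffix come from the text, and whether the suffix is meaningful for every mode (for example `oracle`) is not stated here.

```python
# The four base observation modes listed in the table above.
BASE_MODES = ("oracle", "tactile", "visual", "visuotactile")


def observation_name(mode, with_feature=False):
    """Return the observation name, optionally with the `_and_feature` suffix."""
    if mode not in BASE_MODES:
        raise ValueError(f"unknown observation mode: {mode!r}")
    return f"{mode}_and_feature" if with_feature else mode


plain = observation_name("visual")
extended = observation_name("tactile", with_feature=True)
```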

Preliminary Support for Alternate Robot Arms

The majority of testing is done on the simulated UR5 robot arm. The Franka Emika Panda and Kuka LBR iiwa robot arms are additionally provided; however, there may be bugs when using these arms. In particular, workframes may need to be adjusted to ensure that the arms can comfortably reach all the necessary configurations. These arms can be used by changing the self.arm_type flag within the code.

Bibtex

@misc{church2021optical,
      title={Optical Tactile Sim-to-Real Policy Transfer via Real-to-Sim Tactile Image Translation},
      author={Alex Church and John Lloyd and Raia Hadsell and Nathan F. Lepora},
      year={2021},
      eprint={2106.08796},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}
