Robot open-Ended Autonomous Learning Starter Kit

Instructions to make submissions to the Robot open-Ended Autonomous Learning 2020 competition.

Participants will have to submit their controllers, with packaging specifications, and the evaluator will automatically build a docker image and execute their controllers in two phases: an Intrinsic Phase and an Extrinsic Phase. The Intrinsic Phase will only be run for the Final Evaluation. During Round 1 and Round 2 only the Extrinsic Phase will be run on the evaluator: participants will run the Intrinsic Phase and make their controller learn locally.

Dependencies

  • Anaconda (By following instructions here) At least version 4.5.11 is required to correctly populate environment.yml.
  • real-robots PyPI

Setup

  • Clone the repository
git clone git@github.com:AIcrowd/REAL2020_starter_kit.git
cd REAL2020_starter_kit
  • Create a conda environment from the provided environment.yml
conda env create -f environment.yml
  • Activate the conda environment and install your code specific dependencies
conda activate real_robots
# If, say, you want to install PyTorch:
# conda install pytorch torchvision -c pytorch
#
# You can also use pip to install additional packages, for example:
# pip install -U real-robots
# which updates the real-robots package to the latest version

Test Submission Locally

  • Test locally by running:
python local_evaluation.py

You can edit local_evaluation.py to run either the intrinsic or the extrinsic phase (or both) and adjust their duration and other parameters (e.g. the number of objects) for testing purposes. Regardless of these modifications, when the solution is evaluated, the intrinsic and extrinsic phases will be run for the duration and number of trials defined in the Rules. During Round 1 and Round 2 it is expected that the controller has already learned during a 15M-timestep Intrinsic Phase run locally by the participants.

  • (optional) build docker image locally and run docker container
pip install -U aicrowd-repo2docker
# This also expects that you have Docker and nvidia-docker installed

./debug.sh

How do I specify my software runtime ?

The software runtime is specified by exporting your conda env to the root of your repository by running:

# The included environment.yml is generated by the command below, and you do not need to run it again
# if you did not add any custom dependencies

conda env export --no-build > environment.yml

# Note the `--no-build` flag, which is important if you want your anaconda env to be replicable across platforms

This environment.yml file will be used to recreate the conda environment inside the Docker container. This repository includes an example environment.yml.

You can specify your software environment by using all the available configuration options of repo2docker. (But please remember to use aicrowd-repo2docker to have GPU support)

How can I connect my system to the environment and evaluate it?

You can use local_evaluation.py to evaluate your system on your computer. By default, local_evaluation.py uses the robot controller defined in my_controller.py. In that file you will find a random controller and also a reference to the Baseline algorithm. You can change the local evaluation parameters to adjust many aspects of the environment, such as the length of the intrinsic and extrinsic phases, the action type used by the controller, or the number of objects in the environment.

Evaluation function parameters

  • Controller: class
    The controller to evaluate; it must expose a step function and be a subclass of BasePolicy
  • environment: string
    "R1" or "R2", which represent Round1 and Round2 of the competition respectively
  • action_type: string
    "cartesian", "joints" or "macro_action" (see parameter description below)
  • n_objects: int
    Number of objects on the table (1, 2 or 3)
  • intrinsic_timesteps: int
    Number of timesteps in the Intrinsic phase (default 15e6)
  • extrinsic_timesteps: int
    Number of timesteps in the Extrinsic phase (default 10e3)
  • extrinsic_trials: int
    Total number of trials in the extrinsic phase (default 50)
  • visualize: bool
    Boolean flag which enables or disables the visualization GUI when running the evaluation
  • goals_dataset_path: str
    Path to a goals dataset
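As an illustration, a shortened local test run might configure these parameters as below. The parameter names follow the list above; the call at the end is commented out because the exact entry point (here assumed to be real_robots.evaluate) should be checked against your local copy of local_evaluation.py, and the goals dataset path is a hypothetical placeholder.

```python
# Sketch of evaluation parameters for a quick local test (assumptions noted below)
eval_params = {
    "environment": "R1",           # Round 1 (extra observations available)
    "action_type": "macro_action", # simplified 4-D action space
    "n_objects": 1,                # start with a single object
    "intrinsic_timesteps": 1000,   # shortened from the official 15e6 for testing
    "extrinsic_timesteps": 1000,   # shortened from the official 10e3
    "extrinsic_trials": 5,         # shortened from the official 50
    "visualize": False,
    "goals_dataset_path": "./goals.npy.npz",  # hypothetical path
}

# Assumed entry point; verify against your local_evaluation.py:
# from real_robots import evaluate
# from my_controller import SubmittedPolicy
# result = evaluate(SubmittedPolicy, **eval_params)
```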

Parameters description

Actions type:

  • 'macro_action': Numpy.ndarray([(x1,y1),(x2,y2)]), where (x1,y1) is the start point and (x2,y2) is the end point of the trajectory along the table.
  • 'cartesian': Numpy.ndarray([x,y,z,o1,o2,o3,o4]), where x,y,z is the desired position and o1,o2,o3,o4 is the desired orientation (quaternion) of the gripper.
  • 'joints': Numpy.ndarray([x1,x2,x3,x4,x5,x6,x7,x8,x9]), where each value represents the desired angle for the corresponding joint
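As a sketch, the three action types can be constructed with NumPy as follows; the numeric values are arbitrary illustrations, not meaningful targets.

```python
import numpy as np

# 'macro_action': start and end point of a trajectory along the table
macro_action = np.array([[0.1, 0.3],   # (x1, y1) start point
                         [0.2, 0.3]])  # (x2, y2) end point

# 'cartesian': desired gripper position (x, y, z) plus an orientation
# quaternion (here the wrist-down orientation from the "fixed wrist"
# simplification below)
cartesian = np.array([0.0, 0.5, 0.5,
                      2**0.5 / 2, 2**0.5 / 2, 0.0, 0.0])

# 'joints': one desired angle per joint; keeping the last two at 0
# keeps the gripper closed
joints = np.zeros(9)
```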

Controller class instance: it has to be a subclass of BasePolicy, see policy.py.
BasePolicy has several methods that are called at the start and end of the intrinsic and extrinsic phases and at the beginning and end of each trial. It also has a step method that receives the current observation from the environment and should return the next action:

  • step:
    • input: observation, reward, done

      • observation is a dictionary with several keys. See also environment.md.

        • joint_positions: array with the current joint angle positions
        • touch_sensors: array with touch sensor readings
        • retina: RGB image (dimension: 240x320x3) of the table viewed from above (robot arm is also shown)
        • goal: RGB image of the goal state

        If the environment is "R1" (during Round 1), these additional observations are also provided in the same dictionary:

        • object_positions: a dictionary with a key for each object on the table with associated position and orientation of the object
        • goal_positions: a dictionary with the goal position of each object
        • mask: a segmentation mask of the retina image where each pixel holds an integer index identifying which object occupies that pixel (e.g. -1 is a background pixel, 0 is the robot, 1 is the table, etc.)
        • goal_mask: a segmentation mask of the goal image
      • reward is not used and is always 0

      • done will be True when the intrinsic phase ends or an extrinsic trial ends; otherwise it will always be False.

    • output: action

      • action is a dictionary with two keys:
        • the action type (e.g. 'macro_action'): the action value
        • render: boolean
          The evaluate class passes the dictionary (example: {'macro_action': Numpy.ndarray([ [0.1 , 0.3] , [0.2 , 0.3] ]), 'render': True}) to the environment, which executes the specified action and then returns a new observation.
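A minimal sketch of a controller following this step interface (a real submission must subclass BasePolicy from policy.py; this plain class and its constructor are illustrative assumptions meant only to show the input/output contract):

```python
import numpy as np

class RandomMacroController:
    """Illustrative controller: pushes along a random table trajectory.

    NOT a BasePolicy subclass; a real submission must subclass BasePolicy.
    """

    def __init__(self):
        self.rng = np.random.default_rng(0)

    def step(self, observation, reward, done):
        # Ignore the observation and pick a random trajectory on the table:
        # a 2x2 array of (x1, y1) start and (x2, y2) end points.
        trajectory = self.rng.uniform(-0.5, 0.5, size=(2, 2))
        # Return the action dictionary in the format described above.
        return {"macro_action": trajectory, "render": False}

controller = RandomMacroController()
action = controller.step({}, 0, False)
```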

How can I use the simplifications?

Actions space reduction:

  • 'macro_action': reduces the action space from the 9-dimensional joint space to a four-dimensional space, where the four values (x1,y1,x2,y2) represent a trajectory on the table that starts at (x1,y1) and ends at (x2,y2).
  • 'cartesian': moves the arm in a seven-dimensional space, where the seven values (x,y,z,o1,o2,o3,o4) represent the three-dimensional target point plus the desired gripper orientation. When sending the action to the environment, the action consists of a 'cartesian_command' with those seven values plus a 'gripper_command' to control the two gripper joints (opening/closing the gripper).

Gripper closed:

  • when using joints control, keep the last two joints at 0.
  • when using cartesian control, send a 'gripper_command' with both joints at 0.
  • when using the macro_action control, the gripper is always closed.
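For example, assuming the 'cartesian_command'/'gripper_command' keys described above (check environment.md for the authoritative format), a cartesian action that keeps the gripper closed could be built like this:

```python
import numpy as np

# Cartesian action with the gripper held closed (both gripper joints at 0).
# Key names follow the description above; numeric position values are
# arbitrary illustrations.
action = {
    "cartesian_command": np.array([0.0, 0.5, 0.5,               # position
                                   2**0.5/2, 2**0.5/2, 0, 0]),  # orientation
    "gripper_command": np.zeros(2),  # both gripper joints at 0 -> closed
    "render": False,
}
```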

Fixed wrist:

  • when using joints control, this simplification is not available.
  • when using cartesian control, the desired orientation can be fixed to [√2/2, √2/2, 0, 0]
fixedOrientation = [2**(0.5)/2, 2**(0.5)/2, 0, 0]
  • when using the macro_action control, the wrist is fixed downwards by default.

Abstraction simplifications contained in the observations:

  • coordinates: reduces the image space (320x240x3) to a seven-dimensional space, where (x,y,z) represents the object's position in three-dimensional space and (o1,o2,o3,o4) its orientation.
  • masks: reduces the image space to a filtered image space, where a mask is an image in which each pixel holds an integer number indicating which object occupies that pixel (e.g. a pixel with -1 represents the background).
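As an example of using a mask, the snippet below builds a toy mask with the documented shape and extracts the pixels and centroid of one object; the object index (2) and pixel region are invented for illustration:

```python
import numpy as np

# Toy stand-ins for the real observations (240x320 mask, 240x320x3 retina).
mask = np.full((240, 320), -1, dtype=np.int64)  # start as all background
mask[100:120, 150:180] = 2                      # pretend object index 2 is here
retina = np.zeros((240, 320, 3), dtype=np.uint8)

# Boolean mask of the pixels occupied by object 2
object_pixels = (mask == 2)

# Centroid of the object in image coordinates
rows, cols = np.nonzero(object_pixels)
centroid = (rows.mean(), cols.mean())

# Gather the object's pixels from the retina image
object_rgb = retina[object_pixels]  # shape: (n_pixels, 3)
```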

What should my code structure be like ?

Please follow the structure documented in the included my_controller.py to adapt your already existing code to the required structure for this round.

Baseline

The Baseline subfolder contains an implementation of a solution to the challenge that can be used to get familiar with the simulator and with the challenge in general. We suggest following these steps:

  1. Download the code
  2. Download the intrinsic phase file (transitions_file.zip) made available and described in the subfolder baseline
  3. Start a simulation with the active display (visualize = True in local_evaluation.py) to see how the system acts in the extrinsic phase
  4. Explore the description of the entire system and start editing the local code to see the effect of the changes
  5. Improve the baseline by making changes in one or more of the modules (Exploration, Abstraction, Planning)
  6. (Optional) Create your own new system from scratch.

Important Concepts

Repository Structure

  • aicrowd.json Each repository should have a aicrowd.json with the following content :
{
  "challenge_id": "goal_real_robots_challenge_2020",
  "grader_id": "goal_real_robots_challenge_2020",
  "authors": ["mohanty"],
  "description": "Robot open-Ended Autonomous Learning 2020 (REAL2020) Challenge.",
  "license" : "MIT",
  "debug": true
}

This is used to map your submission to the said challenge, so please remember to use the correct challenge_id and grader_id as specified above.

If you set debug to true, the evaluation will run a reduced number of timesteps, and the logs from your submitted code (if it fails) will be made available to you to help you debug. NOTE: IMPORTANT: by default we have set debug: true, so once you have done the basic integration testing of your code and are ready to make a final submission, please make sure to set debug to false in aicrowd.json.

  • my_controller.py The task is to implement your own controller by following the template provided in my_controller.py. The my_controller.py file should finally reference the implemented class as SubmittedPolicy.

Submission

To make a submission, you will have to create a private repository on https://gitlab.aicrowd.com/.

You will have to add your SSH Keys to your GitLab account by following the instructions here. If you do not have SSH Keys, you will first need to generate one.

Then you can create a submission by making a tag push to your repository on https://gitlab.aicrowd.com/. Any tag push (where the tag name begins with "submission-") to your private repository is considered a submission.
Add the correct git remote and submit by running:

cd REAL2020_starter_kit
# Add AIcrowd git remote endpoint
git remote add aicrowd git@gitlab.aicrowd.com:<YOUR_AICROWD_USER_NAME>/REAL2020_starter_kit.git
git push aicrowd master

# Create a tag for your submission and push
git tag -am "submission-v0.1" submission-v0.1
git push aicrowd master
git push aicrowd submission-v0.1

# Note : If the contents of your repository (latest commit hash) does not change,
# then pushing a new tag will **not** trigger a new evaluation.

You now should be able to see the details of your submission at : https://gitlab.aicrowd.com/<YOUR_AICROWD_USER_NAME>/REAL2020_starter_kit/issues

NOTE: Remember to update your username in the link above 😉

In the link above, you should start seeing something like this take shape (the whole evaluation can take a bit of time, so please be a bit patient too 😉 ) :

(screenshot of the evaluation status page)

Best of Luck 🎉 🎉

Author

Sharada Mohanty
Emilio Cartoni
Davide Montella