This is a Python package that provides a robotic planning environment with an interface similar to OpenAI Gym.
The task is navigation of large 2D robots in tight spaces.
```python
env = RandomMiniEnv()
obs = env.reset()
done = False

while not done:
    # `planner` is your planning algorithm -- see the challenge below
    action = planner.plan(obs)
    obs, reward, done, info = env.step(action)
```
Here's a motion planning challenge for you!

To get started, please see the `run_the_challange.py` script. In there you will see a motion planning experiment implemented in the "planning env" framework.

The task is to implement an algorithm with an interface similar to `SimpleActor`, that is, one that has an `.act` function that maps an `Observation` to an `Action`. Its job is to steer the robot from where it currently is to the end of the path.
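To make the expected interface concrete, here is a minimal sketch of such an actor. The class name, the `compute_command` helper, and the `command=` keyword used to build the `Action` are assumptions for illustration only; `SimpleActor` in `run_the_challange.py` shows the real construction.

```python
from envs.base.action import Action


class MyActor(object):
    """Sketch of the expected interface: an object with an .act method
    that maps an Observation to an Action."""

    def act(self, observation):
        # Inspect the observation (robot pose, remaining path, costmap, ...)
        # and decide how to move.
        command = self.compute_command(observation)  # your planning logic goes here
        # NOTE: the `command=` keyword is an assumption; see SimpleActor in
        # run_the_challange.py for how an Action is really constructed.
        return Action(command=command)

    def compute_command(self, observation):
        raise NotImplementedError
```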
Note that this is not a "full black box RL" setup. Specifically, things that are OK to do include:
- using explicit information about the forward model of the robot (i.e. assuming knowledge of the movement equations and their parameters)
- disabling the noise of the system and implementing a solution to such a simplified problem first

Nevertheless, in case someone wants to use RL, we have included code that provides rewards for following sub-segments of the desired path.
Here's a visualization of our algorithm solving this challenge in real time.
You can save and restore the full state of the environment. This is useful, for example, for Monte Carlo simulation, where you need to run many rollouts from one state.
The syntax works as follows:

```python
state = env.get_state()

# do sth with the env, try out some plan
while sth():
    env.act(some_action)

# restore the env to the previous state
env.set_state(state)
```
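As an illustration, a minimal Monte Carlo-style evaluation of several candidate plans could be built on top of `get_state` / `set_state` like this. The `candidate_plans` list and the scoring by accumulated reward are illustrative assumptions, not part of the framework:

```python
start_state = env.get_state()
best_plan, best_return = None, float('-inf')

for plan in candidate_plans:  # candidate_plans: your own list of action sequences
    total_reward = 0.0
    for action in plan:
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    if total_reward > best_return:
        best_plan, best_return = plan, total_reward
    # rewind the environment before evaluating the next plan
    env.set_state(start_state)
```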
All pieces of the framework can be rendered to basic Python types (`int`, `float`, `dict`, with `numpy.ndarray` being the most complicated).
What is more, the objects can be constructed back from this representation in a completely idempotent way.
This way you can use `pickle` to save / load whatever you want.
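For example, a minimal sketch of saving and restoring an environment state through `pickle` (the file name is arbitrary):

```python
import pickle

# serialize the current state of the environment to disk
state = env.get_state()
with open('saved_state.pkl', 'wb') as f:
    pickle.dump(state, f)

# later: load it back and restore the environment
with open('saved_state.pkl', 'rb') as f:
    restored_state = pickle.load(f)
env.set_state(restored_state)
```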
- `Observation` from `envs.base.obs` represents the type of observation that is returned from the environment.
- `Action` from `envs.base.action` represents the action that is passed for execution to the environment.
- `PlanEnv` from `envs.base.env` is the class that represents the environment itself.
- `State` from `envs.base.env` is the class that represents the state of the environment.
- `ContinuousRewardProvider` from `envs.base.reward` is a class that interprets what the robot has done in the environment and assigns rewards for it.
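For reference, given the module paths listed above, importing these classes looks like:

```python
from envs.base.obs import Observation
from envs.base.action import Action
from envs.base.env import PlanEnv, State
from envs.base.reward import ContinuousRewardProvider
```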
Please see the scripts in `scripts` for examples of how to run the environment.

The base class is `envs.base.env.PlanEnv`. You need to supply a path to follow and a costmap that represents the obstacles. An example of this is given in the script `scripts.env_runners.rw_randomized_corridor_3_boxes`, where we load a custom costmap and path based on percepts from a real robot.
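A rough sketch of constructing the environment directly is shown below. The keyword names `costmap` and `path`, and the variables `my_costmap` and `path_to_follow`, are assumptions made for illustration; the referenced script shows the exact call.

```python
from envs.base.env import PlanEnv

# `my_costmap` and `path_to_follow` stand for data you load yourself,
# e.g. from real robot percepts as in rw_randomized_corridor_3_boxes.
env = PlanEnv(
    costmap=my_costmap,    # obstacle map (assumed keyword name)
    path=path_to_follow,   # sequence of poses to track (assumed keyword name)
)
obs = env.reset()
```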
There are additional classes that supply these paths and costmaps in special ways:
- `envs.mini_env.RandomMiniEnv` - a randomized, synthetic 'parallel parking' small square environment with one obstacle, where you have to reach the next pose, which can be close to you but can have an awkward path to it.
- `envs.synth_turn_env.AisleTurnEnv` - a randomized synthetic environment where you have to follow a path turning into an aisle.
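For instance, a short sketch of instantiating these environments (assuming `AisleTurnEnv` can be constructed without arguments, like `RandomMiniEnv` in the example at the top):

```python
from envs.mini_env import RandomMiniEnv
from envs.synth_turn_env import AisleTurnEnv

env = RandomMiniEnv()   # randomized 'parallel parking' style environment
obs = env.reset()

env = AisleTurnEnv()    # randomized turn-into-an-aisle environment
obs = env.reset()
```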
A frequently asked question we get is: why not just subclass `gym.Env`? This is because `gym` depends on `scipy`, and at Brain Corp we choose not to use `scipy`. As far as we can see from the code, OpenAI `gym` is on the road to removing this dependency. Hopefully we will then subclass it fully.