ARS

An implementation of the Augmented Random Search (ARS) algorithm for The DeepMind Control Suite and Package.


This fork extends ARS to The DeepMind Control Suite and Package.

Implemented policies:

  • linear: the unmodified linear policy from ARS (see the sketch after this list).
  • snp: a policy (SNP) that increases the input dimension with a max operator.
  • mlp: an MLP policy with layer normalization.
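
For reference, here is a minimal sketch of such a linear policy in the style of the ARS paper's V2 variant, which whitens observations with running statistics. The class and attribute names are illustrative, not this repository's API.

import numpy as np

class LinearPolicy:
    """Deterministic linear policy: a = M (x - mu) / sigma."""

    def __init__(self, obs_dim, act_dim):
        self.M = np.zeros((act_dim, obs_dim))  # weights, initialized to zero as in ARS
        self.mu = np.zeros(obs_dim)            # running mean of past observations
        self.sigma = np.ones(obs_dim)          # running std of past observations

    def act(self, obs):
        # Whiten the observation, then apply the linear map.
        return self.M @ ((obs - self.mu) / self.sigma)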

Work-in-progress:

  • lenn: a policy that increases the input dimension with Legendre polynomials (see the sketch after this list).
  • mlp-max: an MLP policy that, following snp, increases the input dimension with a max operator.
  • polynomial: a polynomial policy with input normalization.
  • linear-ensemble: linear policies combined through a weighted sum (à la bagging).
  • linear-residual-policy: a "leader" policy plus additional helper policies.
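
As a rough illustration of the input-augmentation idea behind lenn, the sketch below lifts each observation coordinate through the first few Legendre polynomials. This is a sketch under assumptions: the repository may construct the features differently, and the inputs are assumed to be roughly normalized to [-1, 1].

import numpy as np
from numpy.polynomial import legendre

def legendre_features(x, degree=3):
    """Augment an observation x with Legendre polynomials P_1(x), ..., P_degree(x),
    applied elementwise."""
    # legval(x, c) evaluates sum_k c[k] * P_k(x), so a one-hot coefficient
    # vector picks out an individual polynomial.
    feats = [legendre.legval(x, [0] * k + [1]) for k in range(1, degree + 1)]
    return np.concatenate(feats)

A linear policy can then act on legendre_features(obs) instead of obs, raising the input dimension from n to degree * n.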

Augmented Random Search (ARS)

ARS is a random search method for training linear policies for continuous control problems, introduced in the paper "Simple random search provides a competitive approach to reinforcement learning" (Mania, Guy, and Recht, 2018): https://arxiv.org/abs/1803.07055
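
For intuition, here is a minimal sketch of a single ARS update in its basic V1 form, without the state normalization and top-performing-directions refinements described in the paper. The rollout function and hyperparameter values are placeholders, not this repository's API.

import numpy as np

def ars_step(M, rollout, step_size=0.02, n_directions=8, noise=0.03):
    """One ARS update: perturb the policy weights in random directions and
    step along the reward-weighted average of those directions."""
    deltas = [np.random.randn(*M.shape) for _ in range(n_directions)]
    # Evaluate the policy perturbed in the +delta and -delta directions.
    r_plus = [rollout(M + noise * d) for d in deltas]
    r_minus = [rollout(M - noise * d) for d in deltas]
    # ARS scales the step by the standard deviation of the collected rewards.
    sigma_r = np.std(r_plus + r_minus)
    update = sum((rp - rm) * d for rp, rm, d in zip(r_plus, r_minus, deltas))
    return M + step_size / (n_directions * sigma_r) * update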

Prerequisites for running ARS

Our ARS implementation relies on Python 3, OpenAI Gym, DM Control, and the Ray library for parallel computing.

To install DM Control and MuJoCo dependencies follow the instructions here: https://github.com/deepmind/dm_control

To install Ray execute:

pip install ray

For more information on Ray see http://ray.readthedocs.io/en/latest/.
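
ARS uses Ray to parallelize rollouts across worker processes. The toy sketch below shows the pattern; the rollout stub is a placeholder, not this repository's actual worker code.

import ray

ray.init()

@ray.remote
def rollout(seed):
    # Stand-in for an environment rollout; a real worker would run an
    # episode with the perturbed policy and return its total reward.
    return float(seed)

# Launch several rollouts in parallel and block until all finish.
rewards = ray.get([rollout.remote(s) for s in range(8)])
print(rewards)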

Running ARS

We recommend restricting linear algebra computations to a single thread, so that parallel Ray workers do not oversubscribe the available cores, by setting:

export MKL_NUM_THREADS=1

To train a policy for the "walker" domain with the "walk" task, execute the following command:

python code/ars.py

Rendering Trained Policy

To render a trained policy, execute a command of the following form:

python code/run_ars_policy.py

Please note that moviepy is needed to build the GIF. To install it execute:
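
pip install moviepy

As a rough sketch of how rendered frames can be turned into a GIF (illustrative only; run_ars_policy.py may do this differently, the random action is a stand-in for a trained policy, and the moviepy.editor import assumes moviepy 1.x):

from dm_control import suite
from moviepy.editor import ImageSequenceClip
import numpy as np

env = suite.load(domain_name="walker", task_name="walk")
spec = env.action_spec()
time_step = env.reset()
frames = []
while not time_step.last():
    # Placeholder action: uniform random within the action bounds.
    action = np.random.uniform(spec.minimum, spec.maximum, size=spec.shape)
    time_step = env.step(action)
    frames.append(env.physics.render(height=240, width=320, camera_id=0))

ImageSequenceClip(frames, fps=30).write_gif("walker_walk.gif")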