Implementation of Model-Agnostic Meta-Learning (MAML) applied on Reinforcement Learning problems in TensorFlow 2.
This repo is heavily inspired by the original implementation cbfinn/maml_rl (TensorFlow 1.x), as well as the excellent implementations by Tristan Deleu (tristandeleu/pytorch-maml-rl, PyTorch) and Jonas Rothfuss (jonasrothfuss/ProMP, TensorFlow 1.x). I highly recommend checking out all three implementations as well.
The original MAML algorithm uses TRPO as its optimization method, and so far TRPO is also the only optimizer integrated in this version. Tests with the 2DNavigation-v0 and HalfCheetahDir-v1 environments yield the same results as the original paper. Better TF2 graph support via tf.function and more MAML variants (e.g. CAVIA, ProMP) might be added soon.
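To illustrate the MAML idea used here (an inner-loop gradient step per task, then a meta-update of the initialization on the post-adaptation losses), below is a minimal NumPy sketch on a toy quadratic loss. This is only a didactic sketch of the algorithm, not the repo's actual TF2/TRPO code; the function names and the toy loss are made up for illustration.

```python
import numpy as np

def loss(theta, task):
    # Toy quadratic loss whose optimum differs per task (hypothetical stand-in
    # for the RL objective; in the repo the inner loss is a policy-gradient loss).
    return np.sum((theta - task) ** 2)

def grad(theta, task):
    # Analytic gradient of the toy loss above.
    return 2.0 * (theta - task)

def maml_step(theta, tasks, fast_lr=0.1, meta_lr=0.01):
    """One MAML meta-update: adapt per task with one fast gradient step,
    then move theta along the gradient of the post-adaptation losses."""
    meta_grad = np.zeros_like(theta)
    for task in tasks:
        # Inner loop: one fast gradient step on the task (cf. --fast-lr).
        adapted = theta - fast_lr * grad(theta, task)
        # Outer loop: gradient of the adapted loss w.r.t. the initial theta.
        # For this quadratic loss the chain rule contributes the
        # (1 - 2 * fast_lr) factor from d(adapted)/d(theta).
        meta_grad += (1.0 - 2.0 * fast_lr) * grad(adapted, task)
    return theta - meta_lr * meta_grad / len(tasks)
```

With two symmetric tasks, repeated `maml_step` calls drive the initialization toward the point from which a single fast step adapts well to either task, which is the core of the algorithm (the repo replaces the plain outer gradient step with TRPO).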
Use the main.py script to train an agent with MAML:
python main.py --env-name 2DNavigation-v0 --num-workers 20 --fast-lr 0.1 --max-kl 0.01 --fast-batch-size 20 --meta-batch-size 40 --num-layers 2 --hidden-size 100 --num-batches 500 --gamma 0.99 --tau 1.0 --cg-damping 1e-5 --ls-max-steps 15
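Two of the flags above, --cg-damping and --ls-max-steps, belong to the TRPO meta-optimizer: TRPO solves a Fisher-vector-product linear system with damped conjugate gradient and then performs a backtracking line search. As a rough, hedged sketch of what the conjugate-gradient part does (a generic implementation, not the repo's code; the `Avp` callable stands in for the Fisher-vector product):

```python
import numpy as np

def conjugate_gradient(Avp, b, iters=10, damping=1e-5):
    """Approximately solve (A + damping * I) x = b using only
    matrix-vector products Avp(v), as TRPO does for the Fisher
    matrix (damping corresponds to the --cg-damping flag)."""
    x = np.zeros_like(b)
    r = b.copy()          # residual b - (A + damping*I) @ x, with x = 0
    p = b.copy()          # search direction
    rdotr = r @ r
    for _ in range(iters):
        Ap = Avp(p) + damping * p
        alpha = rdotr / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        new_rdotr = r @ r
        if new_rdotr < 1e-10:  # residual small enough; stop early
            break
        p = r + (new_rdotr / rdotr) * p
        rdotr = new_rdotr
    return x
```

The resulting step direction is then rescaled to satisfy the KL constraint (cf. --max-kl) and checked with up to --ls-max-steps backtracking line-search steps.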
To evaluate the trained agent, run:
python experiments.py
Both scripts were tested with Python 3.6.
This project is, for the most part, a reproduction of the original implementation cbfinn/maml_rl in TensorFlow 2. The experiments are based on the paper
Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. International Conference on Machine Learning (ICML), 2017 [ArXiv]
If you want to cite this paper:
@article{DBLP:journals/corr/FinnAL17,
  author  = {Chelsea Finn and Pieter Abbeel and Sergey Levine},
  title   = {Model-{A}gnostic {M}eta-{L}earning for {F}ast {A}daptation of {D}eep {N}etworks},
  journal = {International Conference on Machine Learning (ICML)},
  year    = {2017},
  url     = {http://arxiv.org/abs/1703.03400}
}