
Example A2C implementation with ReLAx



This repository contains an implementation of advantage actor-critic (A2C) with ReLAx.
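For readers new to A2C, the "advantage" in the name can be illustrated with a minimal sketch in plain Python. This is not ReLAx code and does not reflect how the library or the notebook computes it: the function names `discounted_returns` and `advantages`, the discount factor, and the toy numbers are all assumptions chosen for illustration.

```python
# Illustrative sketch only: hypothetical helper names, not part of ReLAx.

def discounted_returns(rewards, dones, gamma=0.99, bootstrap_value=0.0):
    """Discounted reward-to-go for a batch of consecutive environment steps.

    rewards: per-step rewards
    dones: per-step flags, True where an episode ended at that step
    bootstrap_value: critic estimate for the state after the last step
                     (assumed 0.0 here for simplicity)
    """
    returns = [0.0] * len(rewards)
    running = bootstrap_value
    for t in reversed(range(len(rewards))):
        if dones[t]:
            # Do not propagate reward across an episode boundary.
            running = 0.0
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantages(returns, values):
    """Advantage estimate: observed return minus the critic's value baseline."""
    return [g - v for g, v in zip(returns, values)]


# Toy data (assumed values, not taken from the LunarLander-v2 run).
rewards = [1.0, 0.0, 2.0]
dones = [False, False, True]
values = [2.0, 1.5, 1.0]  # hypothetical critic outputs for each state

returns = discounted_returns(rewards, dones, gamma=0.5)
adv = advantages(returns, values)
print(returns)  # [1.5, 1.0, 2.0]
print(adv)      # [-0.5, -0.5, 1.0]
```

In an actor-critic update, a positive advantage pushes the policy toward the action taken and a negative one pushes it away, while the critic is regressed toward the returns. Practical implementations often use other estimators (for example n-step or GAE), so treat this as one simple variant.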

The A2C actor was trained on the LunarLander-v2 Gym environment for 4M environment steps.

The graph of average return vs. training step is shown below (batch_size=40000):

a2c_training

Resulting Policy:

a2c_run.mp4