During training, an RL agent follows the gradient of its loss, which leads it to a minimum. When that minimum is merely a local one, it can be seen as a false vacuum in the loss landscape. Exploration mechanisms try to let the training procedure escape such stable states, making them metastable.
To achieve this, this repo contains some extensions for Stable Baselines 3 by DLR-RM.
These extensions include:
- An implementation of "Differentiable Trust Region Layers for Deep Reinforcement Learning" by Fabian Otto et al.
- Support for Contextual Covariances (i.e., state-dependent covariances; see the sketch below this list)
- Multiple parameterization strategies for the Covariance
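To illustrate what a contextual covariance means, here is a minimal PyTorch sketch of a diagonal Gaussian policy head whose covariance depends on the state features. This is not this repo's actual implementation, just the underlying idea:

```python
import torch
import torch.nn as nn


class ContextualDiagGaussianHead(nn.Module):
    """Illustrative only: predicts a state-dependent ("contextual")
    diagonal covariance instead of a single global log-std parameter."""

    def __init__(self, feature_dim: int, action_dim: int):
        super().__init__()
        self.mean_net = nn.Linear(feature_dim, action_dim)
        # Contextual: the log-std is a function of the state features ...
        self.log_std_net = nn.Linear(feature_dim, action_dim)
        # ... instead of a single global parameter, e.g.:
        # self.log_std = nn.Parameter(torch.zeros(action_dim))

    def forward(self, features: torch.Tensor) -> torch.distributions.Normal:
        mean = self.mean_net(features)
        # Clamp to keep the standard deviation in a numerically sane range.
        log_std = self.log_std_net(features).clamp(-20, 2)
        return torch.distributions.Normal(mean, log_std.exp())
```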
The resulting algorithms can then be tested for their exploration capabilities in the environments provided by Project Columbus.
This repo was created as part of my bachelor's thesis at ALR (KIT).
Install Project Columbus by following the instructions in its repo.
Follow the instructions for either the Public Version (GitHub Mirror) or the Private Version (GitHub Mirror). The private version additionally requires ALR's ITPAL as a dependency; only the private version supports KL projections.
Then install this repo as a package:

```
pip install -e .
```
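After installation, training should follow the usual SB3 pattern. The sketch below is a hedged example of the intended workflow: the algorithm class `TRL_PG`, its import path, and the Columbus environment id are assumptions for illustration and may differ from the actual names in this repo.

```python
from stable_baselines3.common.env_util import make_vec_env

# Hypothetical import: the actual module and class names may differ.
from metastable_baselines import TRL_PG

# Hypothetical Project Columbus environment id.
env = make_vec_env("ColumbusCandyland-v0", n_envs=4)

# Assuming the extension follows the standard SB3 algorithm interface:
model = TRL_PG("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)
model.save("trl_pg_columbus")
```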
Since this repo is an extension to Stable Baselines 3 by DLR-RM, it contains some of its code. SB3 is licensed under the MIT License.