/AttenPi

An end-to-end differentiable hierarchical reinforcement learning agent based on continuous sub-policy attention.

Primary LanguagePython

AttenPi

See project-poster.png

An end-to-end differentiable hierarchical reinforcement learning agent based on continuous sub-policy attention. (See attenpi-paper.pdf for more details)

  • experiments: Folder containing discarded work and expiriments.

  • figs: Folder containing plots of training data.

  • logs: Folder containing training progress plots and log files.

  • configs.py: Model configuration options for QuantumPolicy.

  • model_utils.py: Model subclasses and helper functions.

  • plot_data.py: Script to plot data from log files.

  • pretrain_cnn.py: Script to pretrain the CNN used for 'coinrun' environment.

  • quantum_policy.py: Contains the QuantumPolicy class, which implements the AttenPi algorithm.

  • train.py: Environment and script used to train QuantumPolicy

  • train_utils: General training script and helper functions.