Varitional Reinfocement Learning (VRL) via Quantum Tensor Networks

This repository provides the implementation for VRL algorithm and applies it to the classic Gridworld puzzle.

NeurIPS 2020 Workshop: First Workshop on Quantum Tensor Networks in Machine Learning

Quantum Tensor Networks for Variational Reinforcement Learning

News We utilize the H-term as an add-on term to improve the stability of the DRL algorithms in the ElegantRL project, e.g., AgentPPO_H.

Usage

First, import the VRL module by

from VRL import *

Then, initialize the parameters. An example initialization would be

s, a = 25, 4
k, gamma, chi = 3, 0.5, 50
R, R_vec = initialize_R(s, a)
P, P_mat = initialize_P(s, a)
five_tuple = s, a, P_mat, R_vec, gamma

Now, explore the environment and train the model by running

omega = explore_env(1000, k, R, P, s, a)
H, cores, data = build_network(five_tuple, k, chi, omega)
spin, energy_history = VRL_train(five_tuple, k, data, cores, lr=0.1, epochs=1000)

One can retrieve the trained policy from the variable spin; the processes of energy minimization is tracked by energy_history.

An example notebook is provided in the root directory.