UCB Momentum Q-learning: Correcting the bias without forgetting

Paper available here.

The algorithm and the baselines are implemented in the folder algorithms/. The folder config/ contains the parameters defining the experiments.

  • Requirements:

    • Python 3.7
    • rlberry version 0.1
    • pyyaml
  • Create and activate a conda virtual environment (optional)

$ conda create -n ucbmq_env python=3.7
$ conda activate ucbmq_env
  • Install requirements
$ pip install 'rlberry[full]==0.1'
$ pip install pyyaml
  • Run the experiment and plot the results
$ python run.py config/experiment.yaml --n_fit=8
$ python plot.py
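
As an optional sanity check before running, the script below is a minimal sketch (it is not part of this repository) that reports whether the interpreter matches the Python version listed under Requirements and whether the rlberry and pyyaml packages can be found. The function names and the file name check_env.py are hypothetical. It only checks that the modules can be located, not that rlberry is at version 0.1.

```python
import importlib.util
import sys

# Top-level import names for the listed requirements.
# Note: the pyyaml package is imported under the name "yaml".
REQUIRED_MODULES = ("rlberry", "yaml")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names in `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_matches(major=3, minor=7):
    """Return True if the running interpreter is Python `major.minor`."""
    return sys.version_info[:2] == (major, minor)


if __name__ == "__main__":
    if not python_matches():
        found = "{}.{}".format(*sys.version_info[:2])
        print("Warning: expected Python 3.7, found " + found)
    missing = missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```

Assuming the script is saved as check_env.py, it could be run inside the activated environment with:

$ python check_env.py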