Example Rainbow DQN implementation with ReLAx

This repository contains an implementation of a Rainbow deep Q-network (Rainbow DQN) built with ReLAx.

The Rainbow DQN actor was trained on the MsPacman-v0 Atari Gym environment for 3M environment steps. The trained models exceed GitHub's 100 MB file size limit and may be found here.

Note: For demonstration purposes, training was run for only 3M steps. In the literature, DQN and its extensions are typically trained for 200M steps, which may require several days of training. This is why the performance here is lower than the results reported in papers.

The graph of average return vs. environment step is shown below (logged every 50k steps):

rainbow_dqn_training
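
For readers who want to reproduce a similar curve from their own runs, here is a minimal sketch of averaging episode returns over 50k-step logging intervals. It does not use ReLAx's logging; the function name `average_return_per_interval` and the `(end_step, episode_return)` input format are assumptions made for illustration.

```python
def average_return_per_interval(episodes, log_every=50_000):
    """Average episode returns per logging interval.

    episodes: iterable of (end_step, episode_return) pairs, where end_step is
    the environment step at which the episode finished (hypothetical format).
    Returns a dict mapping each logging step to the mean return of the
    episodes that finished in the interval ending at that step.
    """
    buckets = {}
    for end_step, episode_return in episodes:
        # Round the end step up to the next multiple of log_every.
        log_step = -(-end_step // log_every) * log_every
        buckets.setdefault(log_step, []).append(episode_return)
    return {
        step: sum(values) / len(values)
        for step, values in sorted(buckets.items())
    }


if __name__ == "__main__":
    demo = [(10_000, 100.0), (40_000, 300.0), (60_000, 500.0)]
    for step, avg in average_return_per_interval(demo).items():
        print(step, avg)
```

The resulting mapping can be plotted directly, with logging steps on the x-axis and mean return on the y-axis.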

The distribution of estimated Q-values vs data Q-values is shown below:

rainbow_dqn_q_func
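
One plausible way to obtain the "data Q-values" for such a comparison is to compute the discounted return-to-go observed along each collected episode. This is an assumption about what the plot shows, not a description of the notebook's code. The helper `discounted_returns` and the default `gamma=0.99` are hypothetical choices for illustration.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute the return-to-go G_t = r_t + gamma * G_{t+1} for one episode.

    rewards: list of per-step rewards from a single finished episode.
    Returns a list of the same length, where entry t is the discounted sum
    of rewards from step t to the end of the episode.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the following step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
```

Under this assumption, each entry would be compared with the Q-value the network estimates for the state and action taken at the same step.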

Resulting Policy:

rainbow_dqn_run.mp4