RL_DQN_Pixelcopter

Project for Baidu RL course

This project uses the DQN algorithm from PARL, Baidu's reinforcement learning framework.
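As a reminder of what DQN optimizes, the sketch below computes the standard one-step Q-learning target that the network is trained to match. It is written in plain Python for illustration and does not use PARL's API; the function name `dqn_target` and the default `gamma` value are assumptions, not taken from this project.

```python
def dqn_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target used by DQN.

    reward:        reward received for the transition
    next_q_values: target-network Q-values for every action in the next state
    done:          True if the episode ended on this transition
    gamma:         discount factor (0.99 is an assumed default)
    """
    if done:
        # No future return after a terminal state.
        return reward
    # Bootstrap from the best action according to the target network.
    return reward + gamma * max(next_q_values)
```

For a non-terminal transition the target adds the discounted best next-state value to the reward; for a terminal transition it is the reward alone.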

All the required functions are in a single Python file, in which:

Most of the functions, e.g. Model(), Agent(), ReplayMemory(), and part of main(), are identical to or slightly modified from the materials provided by the Baidu RL course.
The preprocessing (scaling) of the state was inspired by nbuliyang's project.
The required libraries and their versions are listed in requirements.txt.
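The exact scaling used for the state is not described here, so the following is only a minimal sketch of one common approach: min-max scaling each state feature into [0, 1]. The function name `scale_state` and the per-feature bounds `low` and `high` are hypothetical and are not taken from this project or from nbuliyang's.

```python
def scale_state(state, low, high):
    """Min-max scale each state feature to [0, 1].

    state, low, high are equal-length sequences of numbers; low and high
    are assumed per-feature bounds (hypothetical values, chosen by the user).
    Values outside the bounds are clipped.
    """
    scaled = []
    for value, lo, hi in zip(state, low, high):
        if hi == lo:
            # Degenerate range: avoid division by zero.
            scaled.append(0.0)
            continue
        x = (value - lo) / (hi - lo)
        scaled.append(min(1.0, max(0.0, x)))
    return scaled
```

Keeping all features in a similar range usually makes training less sensitive to the raw units of each feature, such as pixel distances versus velocities.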

Results

The three figures below show the test_reward (mean value of 5 test episodes) and the max_reward (the maximum value among the 5 test episodes):

at the beginning of the training
around 3000 episodes
around 4000 episodes
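The two plotted quantities can be computed as in the sketch below, which assumes the rewards of the test episodes are already collected in a list. The helper name `summarize_test_rewards` is hypothetical and not part of the project code.

```python
def summarize_test_rewards(episode_rewards):
    """Return (test_reward, max_reward) for a batch of test episodes.

    test_reward is the mean episode reward and max_reward is the best
    single-episode reward, matching the definitions given above.
    """
    if not episode_rewards:
        raise ValueError("need at least one test episode")
    test_reward = sum(episode_rewards) / len(episode_rewards)
    max_reward = max(episode_rewards)
    return test_reward, max_reward
```

With 5 test episodes, a large gap between the two values indicates that the agent occasionally plays well but is not yet consistent.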

Videos

At the beginning of the experiment:

Around 3000 episodes:
