Please don't revise, and
- Thursday October 29, 2020 (23:59)
- 100 (One Hundred)
In this project, we will provide a leaderboard and give 10 bonus points to the three students with the highest rewards!
Where to see the leaderboard
- We will create a discussion on Canvas where each of you can post your highest reward with a screenshot. The TA will summarize the posts and list the top 3 highest rewards below.
- The Fall 2019 leaderboard is also posted at the end of this page for reference.
Leaderboard for Breakout-DQN (Update Date: 10/05/2020 16:00)
Top | Date | Name | Score | Note |
How to evaluate
- You should submit your latest trained model and Python code. The TA will run your code to make sure the result is consistent with your screenshot.
How to grade
- The top 3 students on the leaderboard get 10 bonus points for Project 3.
Type the following command to install the OpenAI Gym Atari environment in your virtual environment.
$ pip install opencv-python-headless gym==0.10.4 gym[atari]
Please refer to OpenAI's page if you run into any problems during installation.
training DQN:
$ python --train_dqn
testing DQN:
$ python --test_dqn
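The two flags above could be handled by an entry-point script along these lines. This is only a minimal sketch: the flag names `--train_dqn` and `--test_dqn` come from the commands above, while the function names `parse_args` and `select_mode` are hypothetical and not taken from the sample code.

```python
import argparse


def parse_args(argv=None):
    """Parse the two mode flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Breakout DQN (illustrative sketch)")
    parser.add_argument("--train_dqn", action="store_true", help="train the DQN agent")
    parser.add_argument("--test_dqn", action="store_true", help="evaluate a trained DQN agent")
    return parser.parse_args(argv)


def select_mode(args):
    """Return which mode was requested (hypothetical helper, not from the sample code)."""
    if args.train_dqn:
        return "train"
    if args.test_dqn:
        return "test"
    return "none"
```

With this layout, passing neither flag selects no mode, so the script can print its usage message instead of silently doing nothing.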
In this project, you will implement DQN to play Breakout. The project will be completed in Python 3 using PyTorch. The goal of training is to reach an average reward above 40 points over 100 episodes in Breakout, using OpenAI's Atari wrapper and unclipped rewards. For more details, please see the slides.
Please compress all the files below into a single zip file and submit it to Canvas.
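To make the target concrete, the 100-episode average can be computed as sketched below. The callable `run_episode` is a hypothetical stand-in for whatever plays one full Breakout episode with your trained agent and returns its total unclipped reward; it is not part of the sample code.

```python
def average_reward(run_episode, n_episodes=100):
    """Return the mean total reward over n_episodes.

    run_episode is a hypothetical callable (an assumption for this sketch)
    that plays one complete episode and returns its total unclipped reward.
    """
    if n_episodes <= 0:
        raise ValueError("n_episodes must be positive")
    rewards = [run_episode() for _ in range(n_episodes)]
    return sum(rewards) / len(rewards)


# Stand-in episode with a fixed reward, used only to show the call shape.
def fake_episode():
    return 42.0


avg = average_reward(fake_episode, n_episodes=100)
```

Note that the average is taken over per-episode totals, not per-step rewards, so reward clipping applied during training should be switched off when measuring it.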
Trained Model
- Model file (.pth)
- If your model is too large for Canvas, upload it to a cloud space and provide the download link
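A `.pth` file is commonly produced by saving the model's `state_dict` with `torch.save`. The sketch below assumes a tiny placeholder network, `TinyQNet`, which is hypothetical and much smaller than a real convolutional DQN. The helper names `save_model` and `load_model` are likewise illustrative.

```python
import torch
import torch.nn as nn


class TinyQNet(nn.Module):
    """Hypothetical stand-in network; a real Breakout DQN would be convolutional."""

    def __init__(self, n_inputs=4, n_actions=4):
        super().__init__()
        self.fc = nn.Linear(n_inputs, n_actions)

    def forward(self, x):
        return self.fc(x)


def save_model(model, path):
    """Save only the learned parameters (the state_dict) to a .pth file."""
    torch.save(model.state_dict(), path)


def load_model(path, n_inputs=4, n_actions=4):
    """Rebuild the network and load the saved parameters onto the CPU."""
    model = TinyQNet(n_inputs, n_actions)
    model.load_state_dict(torch.load(path, map_location="cpu"))
    model.eval()  # switch to evaluation mode for testing
    return model
```

Saving the `state_dict` rather than the whole model object keeps the file smaller and avoids tying it to the exact class definition at save time. Passing `map_location="cpu"` lets a model trained on a GPU be loaded on a machine without one.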
PDF Report
Set of Experiments Performed:
- Include a section describing the set of experiments that you performed
- what structures you experimented with (i.e., number of layers, number of neurons in each layer)
- what hyperparameters you varied (e.g., number of epochs of training, batch size and any other parameter values, weight initialization scheme, activation function)
- what kind of loss function you used and what kind of optimizer you used.
Special skills: Include any techniques that improved your agent's performance. Here are some tips that may help. (Optional)
Visualization: Learning curve of DQN.
- X-axis: number of time steps
- Y-axis: average reward over the last 30 episodes.
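The points for such a curve can be computed from per-episode logs, as in the sketch below. It assumes you record each episode's length in environment steps and its total reward. The function name `learning_curve` and its arguments are illustrative, not part of the sample code. Early episodes are averaged over however many episodes exist so far.

```python
def learning_curve(episode_lengths, episode_rewards, window=30):
    """Return (time_steps, avg_rewards) lists for plotting.

    episode_lengths[i]: number of environment steps in episode i.
    episode_rewards[i]: total reward of episode i.
    Each point is placed at the cumulative step count when episode i ends
    and holds the mean reward of the last `window` episodes up to i.
    """
    if len(episode_lengths) != len(episode_rewards):
        raise ValueError("episode_lengths and episode_rewards must have equal length")
    time_steps, avg_rewards = [], []
    total_steps = 0
    for i, length in enumerate(episode_lengths):
        total_steps += length
        recent = episode_rewards[max(0, i - window + 1): i + 1]
        time_steps.append(total_steps)
        avg_rewards.append(sum(recent) / len(recent))
    return time_steps, avg_rewards
```

The two returned lists can be passed straight to a plotting call such as `matplotlib.pyplot.plot(time_steps, avg_rewards)`.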
Python Code
- All the code you implemented, including the provided sample code.
Trained Model (50 points)
- Reaching an average reward of 40 points over 100 episodes in Breakout earns full credit.
- For every point of average reward below 40, 2 points are deducted: an average of 39 loses 2 points, an average of 38 loses 4 points, and so on.
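The deduction rule can be worked through as a small calculation. The 50-point total, the target of 40, and the 2-point penalty come from the rubric above. Two details are assumptions made only for this sketch: a fractional shortfall is rounded up to the next whole point, and the score never drops below zero. The function name `trained_model_points` is hypothetical.

```python
import math


def trained_model_points(avg_reward, full_credit=50, target=40, penalty_per_point=2):
    """Estimate the Trained Model score under the stated rubric.

    Assumptions (not stated in the rubric): fractional shortfalls are
    rounded up, and the score is floored at zero.
    """
    shortfall = max(0.0, target - avg_reward)
    deduction = penalty_per_point * math.ceil(shortfall)
    return max(0, full_credit - deduction)
```

For example, `trained_model_points(39)` gives 48 and `trained_model_points(38)` gives 46, matching the two cases listed in the rubric.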
PDF Report (30 points)
- Set of experiments performed: 20 points
- Visualization: 10 points
Python Code (20 points)
- You get full credit if the scripts run successfully; otherwise you may lose points depending on the error.
- Naive Pytorch Tutorial
- How to Save Model with Pytorch
- Official Pytorch Tutorial
- Official DQN Pytorch Tutorial
- Official DQN paper
- Rainbow: Combining Improvements in Deep Reinforcement Learning
- DQN Tutorial on Medium
- How to use Google Cloud Platform
- How to use Pytorch on GPU
- Other choice for GPU
- Use your own GPU
- Apply for an Ace account or Turing account from WPI
Top | Date | Name | Score |
1 | 10/22/2019 | Prathyush SP | 142.77 |
  | 10/18/2019 | Prathyush SP | 81.07 |
2 | 10/28/2019 | Sapan Agrawal | 91.34 |
3 | 11/1/2019 | Hanshen Yu | 86.82 |
4 | 10/31/2019 | Mohamed Mahdi Alouane | 80.24 |
5 | 10/26/2019 | Vamshi Krishna Uppununthala | 79.5 |
6 | 10/31/2019 | Sai Vineeth K V | 66.5 |
7 | 11/14/2019 | Cory neville | 59.96 |
8 | 10/24/2019 | Shreesha Narasimha Murthy | 56.79 |
9 | 10/20/2019 | Sinan Morcel | 53.26 |