BreakoutRL

Individual Project 3

Deep Q-Network (DQN)

Please do not modify test.py, environment.py, or agent.py

Due Date

  • Thursday October 29, 2020 (23:59)

Total Points

  • 100 (One Hundred)

Leaderboard and Bonus Points

In this project, we provide a leaderboard and will award 10 bonus points to the 3 students with the highest rewards!

  • Where to see the leaderboard

    • We will create a discussion on Canvas where each of you can post your highest reward with a screenshot. The TA will summarize the posts and list the top 3 highest rewards below.
    • The Fall 2019 leaderboard is also posted at the end of this page for reference.

    Leaderboard for Breakout-DQN (updated 10/05/2020 16:00)

    Top  Date  Name  Score  Note
  • How to evaluate

    • You should submit your latest trained model and Python code. The TA will run your code to verify that the result is consistent with your screenshot.
  • How to grade

    • The top 3 students on the leaderboard receive 10 bonus points for Project 3.

Installation

Type the following command to install the OpenAI Gym Atari environment in your virtual environment.

$ pip install opencv-python-headless gym==0.10.4 gym[atari]

Please refer to OpenAI's page if you have any problems during installation.
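
To verify the installation, a quick sanity check like the following should run without errors (a minimal sketch; the environment id BreakoutNoFrameskip-v4 and the 4-tuple step API match this gym version, and the provided environment.py presumably applies the Atari wrappers on top):

    import gym

    # Create a raw Breakout environment; the course's environment.py
    # presumably wraps this with OpenAI's Atari preprocessing.
    env = gym.make('BreakoutNoFrameskip-v4')
    obs = env.reset()
    print('observation shape:', obs.shape)    # raw RGB frames, (210, 160, 3)
    print('action space:', env.action_space)  # Discrete(4) for Breakout

    # One random step; gym 0.10.x returns (obs, reward, done, info).
    obs, reward, done, info = env.step(env.action_space.sample())
    env.close()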

How to run

Training DQN:

  • $ python main.py --train_dqn

Testing DQN:

  • $ python test.py --test_dqn
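
For reference, here is a minimal sketch of how main.py might dispatch on these flags (the flag names come from the commands above; everything else is an illustrative assumption, since main.py is provided):

    import argparse

    def parse_args():
        parser = argparse.ArgumentParser(description='Breakout DQN')
        # Boolean switches matching the commands above.
        parser.add_argument('--train_dqn', action='store_true', help='train the DQN agent')
        parser.add_argument('--test_dqn', action='store_true', help='evaluate a trained model')
        return parser.parse_args()

    if __name__ == '__main__':
        args = parse_args()
        if args.train_dqn:
            print('training...')   # hypothetical: call into your training loop
        elif args.test_dqn:
            print('testing...')    # hypothetical: load the .pth model and evaluate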

Goal

In this project, you will implement DQN to play Breakout. The project will be completed in Python 3 using PyTorch. The goal of training is to achieve an average reward of at least 40 points over 100 episodes of Breakout, using OpenAI's Atari wrapper and unclipped rewards. For more details, please see the slides.
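
As a starting point, here is a minimal PyTorch sketch of the classic DQN architecture from Mnih et al. (2015). It assumes the wrapped environment yields stacks of four 84x84 grayscale frames and that Breakout exposes 4 discrete actions (both standard with OpenAI's Atari wrappers, but check your environment.py):

    import torch.nn as nn

    class DQN(nn.Module):
        """Nature-DQN network: 4 stacked 84x84 frames -> one Q-value per action."""
        def __init__(self, num_actions=4):
            super(DQN, self).__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(4, 32, kernel_size=8, stride=4),   # 84x84 -> 20x20
                nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=4, stride=2),  # 20x20 -> 9x9
                nn.ReLU(),
                nn.Conv2d(64, 64, kernel_size=3, stride=1),  # 9x9 -> 7x7
                nn.ReLU(),
            )
            self.fc = nn.Sequential(
                nn.Linear(64 * 7 * 7, 512),
                nn.ReLU(),
                nn.Linear(512, num_actions),
            )

        def forward(self, x):
            x = self.conv(x)           # (N, 64, 7, 7)
            x = x.view(x.size(0), -1)  # flatten to (N, 3136)
            return self.fc(x)

Training this with experience replay, a target network, and epsilon-greedy exploration is the standard recipe; the exact loss, optimizer, and hyperparameter choices are yours to make and report.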

Deliverables

Please compress all of the files below into a single zip file and submit it (firstName_lastName_hw3.zip) to Canvas.

  • Trained Model

    • Model file (.pth)
    • If your model is too large for Canvas, upload it to a cloud space and provide the download link
  • PDF Report

    • Set of Experiments Performed:

      • Include a section describing the set of experiments that you performed
      • what network structures you experimented with (e.g., number of layers, number of neurons in each layer)
      • what hyperparameters you varied (e.g., number of training epochs, batch size, weight-initialization scheme, activation function, and any other parameter values)
      • what loss function and what optimizer you used.
    • Special techniques (optional): include any techniques that improved performance. Here are some tips that may help.

    • Visualization: learning curve of DQN (see the plotting sketch after this list).

      • X-axis: number of time steps
      • Y-axis: average reward over the last 30 episodes
  • Python Code

    • All the code you implemented, including the provided sample code.
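
For the learning curve, a sketch along these lines works, assuming you log each episode's reward and the time step at which it finished (episode_steps and episode_rewards are hypothetical lists you would populate during training):

    import numpy as np
    import matplotlib.pyplot as plt

    def plot_learning_curve(episode_steps, episode_rewards, window=30):
        """Plot the average reward of the last `window` episodes against time steps."""
        rewards = np.asarray(episode_rewards, dtype=np.float64)
        # Running mean over the trailing window (shorter near the start).
        avg = [rewards[max(0, i - window + 1):i + 1].mean() for i in range(len(rewards))]
        plt.plot(episode_steps, avg)
        plt.xlabel('Time steps')
        plt.ylabel('Average reward (last 30 episodes)')
        plt.title('DQN learning curve on Breakout')
        plt.savefig('learning_curve.png')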

Grading

  • Trained Model (50 points)

    • Achieving an average reward of at least 40 points over 100 episodes of Breakout earns full credit.
    • 2 points are deducted for each point of average reward below 40; e.g., an average reward of 39 loses 2 points, an average reward of 38 loses 4 points, and so on (a small scoring sketch follows this section).
  • PDF Report (30 points)

    • Set of experiments performed: 20 points
    • Visualization: 10 points
  • Python Code (20 points)

    • You receive full credit if the scripts run successfully; otherwise, points may be deducted depending on the error.
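
Expressed as arithmetic, the model score under this rubric could be computed as follows (whether the score is floored at 0 is an assumption, not stated in the rubric):

    def model_score(avg_reward):
        """50 points minus 2 per point of average reward below 40 (floor of 0 assumed)."""
        deduction = 2 * max(0.0, 40.0 - avg_reward)
        return max(0.0, 50.0 - deduction)

    # model_score(40) == 50.0, model_score(39) == 48.0, model_score(38) == 46.0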

Hints

Tips for Using GPU on Google Cloud

Leaderboard for Fall 2019

Top  Date        Name                          Score
1    10/22/2019  Prathyush SP                  142.77
     10/18/2019  Prathyush SP                  81.07
2    10/28/2019  Sapan Agrawal                 91.34
3    11/1/2019   Hanshen Yu                    86.82
4    10/31/2019  Mohamed Mahdi Alouane         80.24
5    10/26/2019  Vamshi Krishna Uppununthala   79.5
6    10/31/2019  Sai Vineeth K V               66.5
7    11/14/2019  Cory neville                  59.96
8    10/24/2019  Shreesha Narasimha Murthy     56.79
9    10/20/2019  Sinan Morcel                  53.26