Inidividual Project 3

Deep Q-learning Network(DQN)

Please don't revise test.py, environment.py and agent.py

Due Date

Thursday October 17, 2019 (23:59)

Total Points

100 (One Hundred)

Installation

Type the following command to install OpenAI Gym Atari environment in your virutal environment.

$ pip install opencv-python-headless gym==0.10.4 gym[atari]

Please refer to OpenAI's page if you have any problem while installing.

How to run :

training DQN:

$ python main.py --train_dqn

testing DQN:

$ python test.py --test_dqn

Goal

In this project, you will be asked to implement DQN to play Breakout. This project will be completed in Python 3 using Pytorch. The goal of your training is to get averaging reward in 100 episodes over 40 points in Breakout, with OpenAI's Atari wrapper & unclipped reward. For more details, please see the slides.

Deliverables

Please compress all the below files into a zipped file and submit the zip file (firstName_lastName_hw3.zip) to Canvas.

Trained Model
- Model file
- If your model is too large for Canvas, upload it to a cloud space and write download.sh to download the model
PDF Report
- Set of Experiments Performed:
  - Include a section describing the set of experiments that you performed
  - what structures you experimented with (i.e., number of layers, number of neurons in each layer)
  - what hyperparameters you varied (e.g., number of epochs of training, batch size and any other parameter values, weight initialization schema, activation function)
  - what kind of loss function you used and what kind of optimizer you used.
- Special skills: Include the skills which can improve the generation quality. Here are some tips may help. (Optional)
- Visualization: Learning curve of DQN.
  - X-axis: number of time steps
  - Y-axis: average reward in last 30 episodes.
Python Code
- All the code you implemented including sample codes.

Grading

Trained Model (50 points)
- Getting averaging reward in 100 episodes over 40 points in Breakout will get full credits.
- For every average reward below 40, you will be taken off 2 points. i.e., you will be taken off 2 points, if getting averaging reward in 100 episodes is 39 and taken off 4 points, if averaging reward is 38, so on so forth.
PDF Report (30 points)
- Set of parameters performed: 20 points
- Visualization: 10 points
Python Code (20 points)
- You can get full credits if the scripts can run successfully, otherwise you may loss some points based on your error.

bucky1995/P3