Hyperparameter-Optimization-for-Deep-Q-Networks


Final Project for COMS 6998 Deep Learning Systems Performance at Columbia University

Collaborator: Ananth Ravi Kumar (https://www.github.com/arkwave)

References: https://github.com/sweetice/Deep-reinforcement-learning-with-pytorch (modified DQN files from this repo)

To-Do

  1. Freeze different subsets of hyperparameters during successive halving to identify the most significant ones.
  2. Use Hyperband to search for hyperparameters.
  3. Clean up the notebooks into more uniform code that can be run efficiently.
  4. Add a requirements.txt; shorten the code and clean up its output.

Project Description

We measure the sensitivity of Deep Q-Networks on different tasks to the learning rate, batch size, optimizer, target Q-network update step size, discount factor, and other hyperparameters, in order to identify the relationship between hyperparameters and efficient convergence to the optimal policy across different state/action regimes.
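As a rough illustration, the hyperparameter dimensions above can be written as a sampling space. All names and ranges below are hypothetical stand-ins, not the exact values used in our experiments:

```python
import random

# Hypothetical search space over the hyperparameters described above.
# Every range here is illustrative, not the project's actual settings.
SEARCH_SPACE = {
    "learning_rate": lambda: 10 ** random.uniform(-5, -2),           # log-uniform
    "batch_size": lambda: random.choice([32, 64, 128, 256]),
    "optimizer": lambda: random.choice(["adam", "rmsprop", "sgd"]),
    "target_update_steps": lambda: random.choice([100, 500, 1000]),  # target Q-network update interval
    "discount_factor": lambda: random.uniform(0.9, 0.999),
}


def sample_config(space):
    """Draw one hyperparameter configuration by sampling each dimension."""
    return {name: draw() for name, draw in space.items()}
```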

Methods Implemented

- Random Search
- Successive Halving
- Bayesian Optimization
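To sketch how the middle method works: successive halving repeatedly evaluates a pool of randomly sampled configurations under a growing training budget and discards the worse fraction each round. The function below is a generic illustration, not our exact interface (the `train_and_eval` callback and `eta` parameter are assumptions):

```python
def successive_halving(configs, train_and_eval, budget=1, eta=2):
    """Evaluate every surviving configuration at the current budget, keep the
    top 1/eta fraction, multiply the budget by eta, and repeat until one
    configuration remains."""
    survivors = list(configs)
    while len(survivors) > 1:
        scores = [(train_and_eval(cfg, budget), cfg) for cfg in survivors]
        scores.sort(key=lambda pair: pair[0], reverse=True)  # higher reward is better
        survivors = [cfg for _, cfg in scores[: max(1, len(survivors) // eta)]]
        budget *= eta
    return survivors[0]
```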

Implementation Details

See the report for in-depth details about our implementation.

File Descriptions

The notebooks can be downloaded and run as-is, but the cells must be executed in order or later cells may fail. Successive halving and random search each have two notebooks: one contains the implementation, and the other visualizes the trained agent (these will eventually be combined into single notebooks). Bayesianopt.ipynb contains the Bayesian optimization implementation. Implementation and visualization were split into separate notebooks because Colab would sometimes crash during visualization, which would otherwise force re-running everything above it.