# open-ai-gym

Deep RL with OpenAI Gym



# Description

Solutions to OpenAI Gym environments using different reinforcement learning and deep learning approaches.


## Solutions with Descriptions

Below is a list of some of the projects in the repo:

- `rl_q_learning.py`
  - Implementation of a Q-learning model using dynamic programming, in which a table holds a value for every state/action pair.
  - This model was able to solve several basic problems, such as FrozenLake and Taxi, with minimal tuning.
- `cartpole - policy_gradient.ipynb`
  - Implementation of a policy gradient / value gradient.
  - The policy gradient learns over time which action to take given an observed state by comparing the true observed reward for a state to the expected reward. The value gradient is responsible for estimating the future reward. The policy gradient implements a version of logistic regression, and the value gradient uses a two-layer regression neural network.
- `cartpole - random_search.ipynb` (Monte Carlo)
  - Uses random search to identify a set of weights such that `heaviside_step_func(np.dot(state, weights))` is able to beat the required benchmarks. This actually works pretty well for this task, and you can quickly learn a set of weights that performs as well as more advanced techniques. It obviously would not scale to larger state spaces.
- `cartpole - kmeans with q_learning.ipynb`
  - Attempts to use k-means clustering to shrink the state space so that a tabular Q-learning model can be applied.
  - This was particularly successful.
- `cartpole - dqn.ipynb`
  - Implements a version of deep Q-learning with experience replay.
  - The network takes an input of
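The tabular Q-learning approach described above can be sketched in a few lines. This is not the repo's code: the 5-state chain environment, the `step` function, and the small per-step cost are hypothetical stand-ins for FrozenLake/Taxi, chosen so the example is self-contained.

```python
import numpy as np

# Hypothetical 5-state chain environment (stand-in for FrozenLake/Taxi):
# action 1 moves right, action 0 moves left; reaching state 4 pays 1.
N_STATES, N_ACTIONS = 5, 2
GOAL = N_STATES - 1

def step(state, action):
    """Deterministic chain dynamics; small per-step cost discourages dithering."""
    next_state = min(state + 1, GOAL) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == GOAL else -0.01
    return next_state, reward, next_state == GOAL

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, N_ACTIONS))   # one cell per state/action pair
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # epsilon-greedy exploration
            if rng.random() < epsilon:
                action = int(rng.integers(N_ACTIONS))
            else:
                action = int(np.argmax(q[state]))
            next_state, reward, done = step(state, action)
            # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
            target = reward + gamma * (0.0 if done else np.max(q[next_state]))
            q[state, action] += alpha * (target - q[state, action])
            state = next_state
    return q

q = train()
policy = np.argmax(q, axis=1)  # greedy policy: move right in every non-terminal state
```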
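The policy-gradient/value-gradient split can be illustrated with a REINFORCE-style sketch. Again this is not the notebook's CartPole code: the one-step task (action 1 is rewarded when `state[0] > 0`) is invented, and a linear baseline stands in for the two-layer value network, to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w_policy = np.zeros(2)  # logistic-regression policy: P(a=1|s) = sigmoid(s . w)
w_value = np.zeros(2)   # linear value baseline (stand-in for the 2-layer NN)
lr_pi, lr_v = 0.1, 0.1

for _ in range(2000):
    state = rng.choice([-1.0, 1.0], size=2)
    p = sigmoid(state @ w_policy)
    action = 1 if rng.random() < p else 0
    reward = 1.0 if action == (1 if state[0] > 0 else 0) else 0.0

    baseline = state @ w_value
    advantage = reward - baseline        # observed reward vs expected reward
    # grad log pi(a|s) for a Bernoulli-logistic policy is (a - p) * s
    w_policy += lr_pi * advantage * (action - p) * state
    w_value += lr_v * (reward - baseline) * state  # regress baseline toward reward

# After training the policy should prefer action 1 whenever state[0] > 0.
p_pos = sigmoid(np.array([1.0, 1.0]) @ w_policy)
p_neg = sigmoid(np.array([-1.0, 1.0]) @ w_policy)
```

Subtracting the baseline from the observed reward is exactly the "compare true reward to expected reward" step described above; it reduces the variance of the gradient without biasing it.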
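The random-search idea is even simpler to sketch: sample random weight vectors and keep whichever makes `heaviside(np.dot(state, weights))` earn the best return. The "environment" here is a hypothetical stand-in, not CartPole: action 1 is correct whenever `state[0] + state[1] > 0`.

```python
import numpy as np

rng = np.random.default_rng(0)

def heaviside_policy(state, weights):
    return 1 if np.dot(state, weights) > 0 else 0

def episode_return(weights, n_steps=50):
    """Average reward of the policy over random states."""
    total = 0.0
    for _ in range(n_steps):
        state = rng.normal(size=2)
        correct = 1 if state[0] + state[1] > 0 else 0
        total += 1.0 if heaviside_policy(state, weights) == correct else 0.0
    return total / n_steps

best_weights, best_return = None, -np.inf
for _ in range(200):                 # pure random search: no gradients at all
    weights = rng.uniform(-1, 1, size=2)
    ret = episode_return(weights)
    if ret > best_return:
        best_weights, best_return = weights, ret
```

With only two weights the search space is tiny, which is why this works so well on CartPole-sized problems and why it cannot scale: the volume to cover grows exponentially with the number of weights.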
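The k-means discretization step can be sketched as: fit centroids on sampled continuous states, then use the nearest-centroid index as the discrete state a tabular Q-learner indexes into. This is a plain-NumPy Lloyd's-algorithm sketch on invented two-blob data, not the notebook's CartPole observations.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(points, k, iters=20):
    """Minimal Lloyd's algorithm: returns a (k, dim) array of centroids."""
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest centroid
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids

def discretize(state, centroids):
    """Map a continuous state to the index of its nearest centroid."""
    return int(np.linalg.norm(centroids - state, axis=1).argmin())

# Fit on states sampled from two well-separated blobs.
samples = np.vstack([
    rng.normal(loc=[-2.0, -2.0], scale=0.3, size=(100, 2)),
    rng.normal(loc=[2.0, 2.0], scale=0.3, size=(100, 2)),
])
centroids = kmeans(samples, k=2)
```

The cluster index returned by `discretize` can then be used exactly like the discrete states in the tabular Q-learning solution.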
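The experience-replay component of the DQN can be sketched independently of the network: transitions go into a bounded buffer, and training batches are drawn uniformly at random, breaking the correlation between consecutive steps. The `ReplayBuffer` class and its capacity are illustrative, not the notebook's implementation.

```python
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions fall off the left

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Uniformly sample a batch of stored transitions."""
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=100)
for t in range(150):                  # overfilling exercises the capacity cap
    buf.add(t, t % 2, 1.0, t + 1, False)
batch = buf.sample(32)
```

Each sampled batch would then be used to compute the targets `r + gamma * max_a' Q(s', a')` for a gradient step on the network.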

# References

https://www.nervanasys.com/demystifying-deep-reinforcement-learning/

https://github.com/dennybritz/reinforcement-learning

http://kvfrans.com/simple-algoritms-for-solving-cartpole/

Neural Evolution: https://gist.github.com/DollarAkshay/14059981d90c98607339d3ee17d2f0e9#file-openai_cartpole_v0-py