# Description

### Solutions to OpenAI Gym problems using different reinforcement learning and deep learning approaches

## Solutions with Descriptions

##### Below is a list of some of the projects in the repo:
- rl_q_learning.py
  - Implementation of a Q-learning model using dynamic programming, in which a table represents every state/action pair (a minimal sketch appears after this list).
  - This model was able to solve several basic problems, like FrozenLake and Taxi, with minimal tuning.
- cartpole - policy_gradient.ipynb
  - Implementation of a policy gradient / value gradient (sketched after the list).
  - The policy gradient learns over time which action to take given an observed state by comparing the true observed reward for that state to the expected reward; the value gradient is responsible for providing the estimate of the future reward. The policy gradient implements a version of logistic regression, and the value gradient uses a two-layer regression neural network.
- cartpole - random_search.ipynb (Monte Carlo)
  - Uses random search to identify a set of weights such that `heaviside_step_func(np.dot(state, weights))` is able to beat the required benchmark (see the sketch after the list). This actually works pretty well for this task: you can quickly learn a set of weights that performs as well as more advanced techniques. It obviously would not scale to larger state spaces.
- cartpole - kmeans with q_learning.ipynb
  - Attempts to use k-means clustering to decrease the size of the state space so that a Q-learning model can be applied (sketched after the list).
  - This approach was particularly successful.
- cartpole - dqn.ipynb
  - Implements a version of deep Q-learning with experience replay (sketched after the list).
  - The network takes the observed state as input and outputs an estimated Q-value for each action.
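The sketches below illustrate each of the approaches above. First, a minimal sketch of the tabular Q-learning loop, assuming the classic `gym` step API (`obs, reward, done, info = env.step(action)`); the hyperparameters and episode count are illustrative, not the repo's exact settings:

```python
import gym
import numpy as np

env = gym.make("FrozenLake-v1")  # "FrozenLake-v0" on older gym releases
q_table = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1  # learning rate, discount, exploration

for episode in range(5000):
    state, done = env.reset(), False
    while not done:
        # epsilon-greedy: explore randomly, otherwise act on the table
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_table[state]))
        next_state, reward, done, _ = env.step(action)
        # move Q(s, a) toward the bootstrapped target r + gamma * max_a' Q(s', a')
        target = reward + gamma * np.max(q_table[next_state]) * (not done)
        q_table[state, action] += alpha * (target - q_table[state, action])
        state = next_state
```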
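The policy gradient / value gradient split can be sketched in plain numpy. The version below is a hedged REINFORCE-style approximation: a logistic-regression policy updated with discounted returns, with a simple mean baseline standing in where the notebook's two-layer value network would go; the variable names and hyperparameters are assumptions.

```python
import gym
import numpy as np

env = gym.make("CartPole-v0")
w = np.zeros(4)  # policy weights: P(action=1 | s) = sigmoid(w . s)
lr, gamma = 0.01, 0.99

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for episode in range(2000):
    states, actions, rewards = [], [], []
    state, done = env.reset(), False
    while not done:
        p = sigmoid(w @ state)  # logistic-regression policy
        action = 1 if np.random.rand() < p else 0
        states.append(state)
        actions.append(action)
        state, reward, done, _ = env.step(action)
        rewards.append(reward)
    # discounted returns; the notebook's two-layer value network would
    # supply this estimate instead of a simple mean baseline
    returns, running = np.zeros(len(rewards)), 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    advantages = returns - returns.mean()
    # log-likelihood gradient of the Bernoulli policy: (a - p(s)) * s
    for s, a, adv in zip(states, actions, advantages):
        w += lr * adv * (a - sigmoid(w @ s)) * s
```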
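The random-search notebook's core loop amounts to sampling weight vectors and keeping the best one; a sketch, assuming CartPole-v0 and its 200-step reward cap:

```python
import gym
import numpy as np

env = gym.make("CartPole-v0")

def run_episode(env, weights):
    state, done, total = env.reset(), False, 0.0
    while not done:
        # heaviside step: action 1 if the weighted sum is positive, else 0
        action = int(np.dot(state, weights) > 0)
        state, reward, done, _ = env.step(action)
        total += reward
    return total

best_weights, best_reward = None, -np.inf
for _ in range(1000):
    weights = np.random.uniform(-1.0, 1.0, 4)  # random candidate
    reward = run_episode(env, weights)
    if reward > best_reward:
        best_weights, best_reward = weights, reward
    if best_reward >= 200:  # CartPole-v0's per-episode reward cap
        break
```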
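A sketch of the k-means discretization idea: fit clusters on states collected from random rollouts, then run tabular Q-learning over cluster indices instead of raw states. The cluster count and rollout counts here are illustrative assumptions.

```python
import gym
import numpy as np
from sklearn.cluster import KMeans

env = gym.make("CartPole-v0")
n_clusters = 100  # assumed cluster count

# collect states from random rollouts to fit the clusters
samples = []
for _ in range(50):
    state, done = env.reset(), False
    while not done:
        samples.append(state)
        state, _, done, _ = env.step(env.action_space.sample())
kmeans = KMeans(n_clusters=n_clusters, n_init=10).fit(np.array(samples))

# tabular Q-learning over cluster indices instead of raw states
q_table = np.zeros((n_clusters, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1
for episode in range(2000):
    state, done = env.reset(), False
    s = kmeans.predict(state.reshape(1, -1))[0]
    while not done:
        if np.random.rand() < epsilon:
            a = env.action_space.sample()
        else:
            a = int(np.argmax(q_table[s]))
        next_state, reward, done, _ = env.step(a)
        s2 = kmeans.predict(next_state.reshape(1, -1))[0]
        target = reward + gamma * np.max(q_table[s2]) * (not done)
        q_table[s, a] += alpha * (target - q_table[s, a])
        s = s2
```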
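Finally, a hedged sketch of deep Q-learning with experience replay: transitions go into a replay buffer, and the network is trained on random minibatches to decorrelate updates. The Keras network shape, buffer size, and hyperparameters are assumptions, not the notebook's exact configuration.

```python
import random
from collections import deque

import gym
import numpy as np
from tensorflow import keras

env = gym.make("CartPole-v0")
n_actions = env.action_space.n

model = keras.Sequential([
    keras.layers.Dense(24, activation="relu", input_shape=(4,)),
    keras.layers.Dense(24, activation="relu"),
    keras.layers.Dense(n_actions, activation="linear"),  # one Q-value per action
])
model.compile(optimizer=keras.optimizers.Adam(1e-3), loss="mse")

buffer = deque(maxlen=10000)  # experience replay memory
gamma, epsilon, batch_size = 0.99, 0.1, 32

for episode in range(500):
    state, done = env.reset(), False
    while not done:
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(model.predict(state[None], verbose=0)[0]))
        next_state, reward, done, _ = env.step(action)
        buffer.append((state, action, reward, next_state, done))
        state = next_state
        if len(buffer) >= batch_size:
            # sample a decorrelated minibatch and regress toward TD targets
            batch = random.sample(buffer, batch_size)
            s, a, r, s2, d = map(np.array, zip(*batch))
            targets = model.predict(s, verbose=0)
            q_next = model.predict(s2, verbose=0).max(axis=1)
            targets[np.arange(batch_size), a] = r + gamma * q_next * (1 - d)
            model.fit(s, targets, verbose=0)
```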
# References

- https://www.nervanasys.com/demystifying-deep-reinforcement-learning/
- https://github.com/dennybritz/reinforcement-learning
- http://kvfrans.com/simple-algoritms-for-solving-cartpole/
- Neural Evolution: https://gist.github.com/DollarAkshay/14059981d90c98607339d3ee17d2f0e9#file-openai_cartpole_v0-py