deep_RL-multi-arm-bandit-exploration

An implementation of the reinforcement learning multi-armed bandit experiment using different exploration techniques.


Multi-arm Bandits Exploration

This is a bandit experiment that implements different exploration techniques on the 10-armed testbed described in Reinforcement Learning: An Introduction by Sutton & Barto.
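For orientation, the 10-armed testbed can be sketched as follows. This is a minimal, standard-library sketch and not necessarily how this repository implements it: the class name `TenArmTestbed` and its methods are hypothetical. It assumes the textbook setup in which each arm's true value is drawn from a unit normal distribution and each reward is the true value plus unit-variance Gaussian noise.

```python
import random


class TenArmTestbed:
    """Stationary k-armed bandit (hypothetical helper, not the repo's class).

    True action values q*(a) are drawn from N(0, 1); pulling arm a
    returns a reward drawn from N(q*(a), 1).
    """

    def __init__(self, k=10, seed=None):
        self.rng = random.Random(seed)
        self.k = k
        self.q_true = [self.rng.gauss(0.0, 1.0) for _ in range(k)]

    def optimal_arm(self):
        # Index of the arm with the highest true value.
        return max(range(self.k), key=lambda a: self.q_true[a])

    def pull(self, arm):
        # Noisy reward centred on the arm's true value.
        return self.rng.gauss(self.q_true[arm], 1.0)


bandit = TenArmTestbed(seed=0)
reward = bandit.pull(bandit.optimal_arm())
```

Passing a `seed` makes a run reproducible, which helps when comparing exploration techniques on the same set of arms.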

The exploration techniques covered include:

  • ε-greedy
  • Optimistic Initialization
  • UCB Exploration
  • Boltzmann (Softmax) Exploration
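The four techniques listed above differ mainly in how the next arm is chosen from the current value estimates. The sketch below shows one common formulation of each rule using only the standard library. The function names, the `rng` parameter, and the default values (`initial_value=5.0`, `c=2.0`) are illustrative assumptions and are not taken from this repository's code.

```python
import math
import random


def epsilon_greedy(q, epsilon, rng=random):
    """With probability epsilon pick a random arm, otherwise the greedy arm."""
    if rng.random() < epsilon:
        return rng.randrange(len(q))
    return max(range(len(q)), key=lambda a: q[a])


def optimistic_initial_estimates(k, initial_value=5.0):
    """Start every estimate well above plausible rewards, so that even a
    purely greedy agent is pushed to try each arm early on."""
    return [initial_value] * k


def ucb(q, counts, t, c=2.0):
    """Upper Confidence Bound: estimate plus an uncertainty bonus.

    t is the 1-based time step; arms that were never pulled go first.
    """
    for a, n in enumerate(counts):
        if n == 0:
            return a
    return max(
        range(len(q)),
        key=lambda a: q[a] + c * math.sqrt(math.log(t) / counts[a]),
    )


def boltzmann(q, temperature, rng=random):
    """Sample an arm with probability proportional to exp(q / temperature)."""
    m = max(q)  # subtract the max for numerical stability
    weights = [math.exp((v - m) / temperature) for v in q]
    return rng.choices(range(len(q)), weights=weights)[0]
```

A low `temperature` makes Boltzmann selection nearly greedy, while a high one approaches uniform random selection; `epsilon` and `c` play a similar role in controlling how much each rule explores.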

The experiment also compares the exploration techniques against each other and draws conclusions about which one is better suited to which setting.
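A comparison of this kind is usually made by averaging a performance measure over many independent runs. The sketch below is one way to do that, not the repository's actual harness: `run_bandit`, `average_over_runs`, and all parameter values are hypothetical. It covers only greedy, ε-greedy, and optimistic greedy agents, and measures the fraction of steps on which the optimal arm was chosen.

```python
import random


def run_bandit(epsilon, initial_value=0.0, step_size=None,
               k=10, steps=500, seed=0):
    """Run one (epsilon-)greedy agent; return its optimal-action rate."""
    rng = random.Random(seed)
    q_true = [rng.gauss(0.0, 1.0) for _ in range(k)]
    best = max(range(k), key=lambda a: q_true[a])
    q = [initial_value] * k   # value estimates
    counts = [0] * k          # pulls per arm
    optimal_picks = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(k)
        else:
            arm = max(range(k), key=lambda a: q[a])
        reward = rng.gauss(q_true[arm], 1.0)
        counts[arm] += 1
        # Sample average by default, constant step size if one is given.
        alpha = step_size if step_size is not None else 1.0 / counts[arm]
        q[arm] += alpha * (reward - q[arm])
        optimal_picks += arm == best
    return optimal_picks / steps


def average_over_runs(runs=100, **kwargs):
    """Average the optimal-action rate over independently seeded runs."""
    return sum(run_bandit(seed=s, **kwargs) for s in range(runs)) / runs


results = {
    "greedy": average_over_runs(epsilon=0.0),
    "epsilon=0.1": average_over_runs(epsilon=0.1),
    "optimistic greedy": average_over_runs(
        epsilon=0.0, initial_value=5.0, step_size=0.1),
}
for name, rate in results.items():
    print(f"{name}: optimal action chosen {rate:.1%} of the time")
```

The optimistic agent uses a constant step size here because, with sample averaging, the first pull of an arm overwrites its initial estimate entirely and the optimism disappears after one step per arm.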