Implementation of common bandit algorithms for the stochastic Bernoulli setting:
- Epsilon-greedy
- UCB1
- Thompson Sampling
- Approximate Gittins index (see the lecture by Tor Lattimore: https://www.youtube.com/watch?v=p8AwKiudhZ4)
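
To give a feel for how the first three algorithms in the list differ, here is a minimal standard-library sketch. It is not the notebook's code: the names `BernoulliBandit`, `epsilon_greedy`, `ucb1`, `thompson`, and `run`, and the defaults (`epsilon=0.1`, a uniform Beta(1, 1) prior, `horizon=2000`), are assumptions chosen for illustration. The UCB1 bonus uses the current round index `t` inside the logarithm. The approximate Gittins index is left out because its computation is more involved.

```python
import math
import random


class BernoulliBandit:
    """K-armed bandit: arm i pays 1 with probability probs[i], else 0."""

    def __init__(self, probs, seed=0):
        self.probs = list(probs)
        self.rng = random.Random(seed)

    def pull(self, arm):
        return 1 if self.rng.random() < self.probs[arm] else 0


def epsilon_greedy(counts, successes, t, rng, epsilon=0.1):
    # With probability epsilon pick a uniformly random arm, otherwise
    # pick the arm with the highest empirical mean (unplayed arms count as 0).
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    means = [s / n if n > 0 else 0.0 for s, n in zip(successes, counts)]
    return max(range(len(counts)), key=means.__getitem__)


def ucb1(counts, successes, t, rng):
    # Play each arm once, then maximise mean + sqrt(2 ln t / n).
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        s / n + math.sqrt(2.0 * math.log(t) / n)
        for s, n in zip(successes, counts)
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def thompson(counts, successes, t, rng):
    # Draw one sample per arm from its Beta(1 + successes, 1 + failures)
    # posterior (uniform prior assumed) and play the largest sample.
    samples = [
        rng.betavariate(1 + s, 1 + n - s)
        for s, n in zip(successes, counts)
    ]
    return max(range(len(counts)), key=samples.__getitem__)


def run(policy, probs, horizon=2000, seed=0):
    """Run one policy and return (pull counts per arm, expected regret)."""
    bandit = BernoulliBandit(probs, seed=seed)
    rng = random.Random(seed + 1)  # separate stream for the policy's own randomness
    counts = [0] * len(probs)
    successes = [0] * len(probs)
    for t in range(1, horizon + 1):
        arm = policy(counts, successes, t, rng)
        reward = bandit.pull(arm)
        counts[arm] += 1
        successes[arm] += reward
    best = max(probs)
    regret = sum((best - p) * n for p, n in zip(probs, counts))
    return counts, regret


if __name__ == "__main__":
    for policy in (epsilon_greedy, ucb1, thompson):
        counts, regret = run(policy, probs=[0.1, 0.9])
        print(policy.__name__, counts, round(regret, 1))
```

All three policies share one signature, `(counts, successes, t, rng)`, so `run` can compare them on the same simulated bandit; the regret it reports is the expected (pseudo-)regret computed from the true arm probabilities, which a real experiment would not have access to.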