ashish1401/MAB-UCB

An implementation of Upper Confidence Bound (UCB), a well-known reinforcement learning algorithm for the multi-armed bandit problem. The program returns the number of pulls per "arm" that the UCB algorithm selects in order to minimize regret.
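
The description does not say which UCB variant the notebook uses, so the following is only a minimal sketch of the standard UCB1 rule, not the repository's actual code. The function name `ucb1`, the Bernoulli reward model, and the parameters `arm_means`, `n_rounds`, and `seed` are all assumptions made for illustration. Each arm is pulled once, and after that the arm with the highest value of average reward plus `sqrt(2 * ln(t) / pulls)` is chosen in each round.

```python
import math
import random


def ucb1(arm_means, n_rounds, seed=0):
    """Simulate UCB1 on Bernoulli arms and return the pull count per arm.

    arm_means: hypothetical true success probabilities, one per arm.
    n_rounds:  total number of pulls (assumed >= number of arms).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    pulls = [0] * n_arms      # how many times each arm was pulled
    totals = [0.0] * n_arms   # summed reward collected from each arm

    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            # Initialization: pull every arm once so each has an estimate.
            arm = t - 1
        else:
            # Choose the arm with the largest upper confidence bound:
            # empirical mean plus an exploration bonus that shrinks
            # as the arm accumulates pulls.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / pulls[a]
                + math.sqrt(2.0 * math.log(t) / pulls[a]),
            )

        # Assumed reward model: 1 with probability arm_means[arm], else 0.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward

    return pulls


if __name__ == "__main__":
    # Two hypothetical arms; the second pays off far more often.
    print(ucb1([0.1, 0.9], 2000))
```

Under these assumptions the returned list sums to `n_rounds`, and the arm with the higher success probability should receive the large majority of pulls, because the exploration bonus of a rarely rewarding arm only grows logarithmically with time.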

Jupyter Notebook
