## MAB-UCB
ashish1401/MAB-UCB
An implementation of Upper Confidence Bound (UCB), a well-known reinforcement learning algorithm for the multi-armed bandit problem. The program reports how many times each "arm" is pulled when arms are selected according to UCB, which aims to minimize regret.
Jupyter Notebook
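
To make the description above concrete, here is a minimal sketch of the common UCB1 variant. It is not the notebook's code: the function name `ucb1`, the parameters `arm_means`, `n_rounds`, and `seed`, the simulated Bernoulli (0/1) rewards, and the exploration term `sqrt(2 * ln(t) / n_i)` are all assumptions chosen for illustration.

```python
import math
import random


def ucb1(arm_means, n_rounds, seed=0):
    """Simulate UCB1 on Bernoulli arms and return the number of pulls per arm.

    arm_means: hypothetical true success probability of each arm (assumption).
    n_rounds:  total number of pulls to simulate.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms        # pulls per arm
    reward_sums = [0.0] * n_arms  # total reward collected per arm

    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            # Initialisation: pull every arm once so each has an estimate.
            arm = t - 1
        else:
            # Pick the arm with the highest upper confidence bound:
            # empirical mean plus an exploration bonus that shrinks
            # as the arm is pulled more often.
            arm = max(
                range(n_arms),
                key=lambda i: reward_sums[i] / counts[i]
                + math.sqrt(2 * math.log(t) / counts[i]),
            )

        # Simulated Bernoulli reward (assumed reward model).
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        reward_sums[arm] += reward

    return counts


pulls = ucb1([0.2, 0.5, 0.8], n_rounds=2000)
print(pulls)  # one pull count per arm
```

Over enough rounds, most pulls should concentrate on the arm with the highest mean reward, while weaker arms are still sampled occasionally because their confidence bonus grows with `t`.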