ashish1401/MAB-UCB
Code implementation of Upper Confidence Bound (UCB), a well-known reinforcement learning algorithm for the multi-armed bandit problem. The program returns the number of pulls per "arm" that the UCB algorithm makes while trying to minimize regret.
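As a rough illustration of the idea, here is a minimal sketch of the standard UCB1 variant. It is not taken from the repository's notebook. The function name `ucb1`, the simulated Bernoulli arms, the exploration constant 2, and the fixed seed are all assumptions made for this example.

```python
import math
import random


def ucb1(arm_probs, n_rounds, seed=0):
    """Run UCB1 on simulated Bernoulli arms and return the pulls per arm.

    arm_probs: hypothetical success probability of each arm (assumed setup).
    n_rounds:  total number of pulls to make.
    """
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    pulls = [0] * n_arms      # how many times each arm was pulled
    totals = [0.0] * n_arms   # summed reward collected from each arm

    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            # Initialization: pull every arm once so each mean is defined.
            arm = t - 1
        else:
            # Pick the arm with the largest upper confidence bound:
            # empirical mean plus an exploration bonus that shrinks
            # as the arm is pulled more often.
            arm = max(
                range(n_arms),
                key=lambda i: totals[i] / pulls[i]
                + math.sqrt(2.0 * math.log(t) / pulls[i]),
            )

        # Simulated Bernoulli reward (an assumption of this sketch).
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward

    return pulls


pulls = ucb1([0.2, 0.5, 0.8], n_rounds=2000)
print(pulls)  # one pull count per arm; the counts sum to n_rounds
```

The exploration bonus grows slowly with the round number and shrinks with an arm's own pull count, so rarely tried arms are revisited occasionally while arms with high observed rewards receive most of the pulls.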
Jupyter Notebook