/Reinforcement-Learning

Upper Confidence Bound (UCB) & Thompson Sampling.

Primary Language: Python
