MAB-policies

Policies for multi-armed bandit problems: epsilon-greedy, UCB1, Thompson sampling, and Exp3. A minimal usage sketch is shown below.
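A minimal epsilon-greedy sketch on a Bernoulli bandit; the function and variable names here are illustrative assumptions, not this repository's actual API.

```python
# Minimal epsilon-greedy sketch (illustrative only, not the repo's API).
import random

def epsilon_greedy(true_means, epsilon=0.1, horizon=1000):
    """Run epsilon-greedy on a Bernoulli bandit with the given arm means."""
    n_arms = len(true_means)
    counts = [0] * n_arms      # pulls per arm
    values = [0.0] * n_arms    # running mean reward per arm
    total_reward = 0.0

    for _ in range(horizon):
        if random.random() < epsilon:
            arm = random.randrange(n_arms)                      # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])   # exploit
        reward = 1.0 if random.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental update of the empirical mean for the pulled arm
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return values, counts, total_reward

if __name__ == "__main__":
    est, pulls, total = epsilon_greedy([0.2, 0.5, 0.7])
    print("estimated means:", est)
    print("pulls per arm:", pulls)
    print("total reward:", total)
```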

Primary Language: Python