vw-bandit

Python implementation of a multi-armed bandit using epsilon-greedy exploration and sample-average reward estimation.
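A minimal sketch of the technique the description names: with probability epsilon the agent explores a random arm, otherwise it exploits the arm with the highest estimated value, and each estimate is the running sample average of observed rewards. The class name, method names, and the Bernoulli-arm simulation below are illustrative assumptions, not the repository's actual API.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy multi-armed bandit with sample-average value estimates.

    Illustrative sketch; names and structure are assumptions, not this repo's API.
    """

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = [0] * n_arms    # number of pulls per arm
        self.values = [0.0] * n_arms  # running average reward per arm

    def select_arm(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_arms)
        return max(range(self.n_arms), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental sample-average update: Q_n = Q_{n-1} + (r - Q_{n-1}) / n
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


if __name__ == "__main__":
    # Simulate three Bernoulli arms with hypothetical success probabilities.
    probs = [0.2, 0.5, 0.8]
    env_rng = random.Random(1)
    bandit = EpsilonGreedyBandit(n_arms=3, epsilon=0.1, seed=0)
    for _ in range(2000):
        arm = bandit.select_arm()
        reward = 1.0 if env_rng.random() < probs[arm] else 0.0
        bandit.update(arm, reward)
    print(bandit.values)  # estimates should roughly track probs
```

The incremental update avoids storing per-arm reward histories: each estimate moves toward the new reward by a step size of 1/n, which is algebraically identical to averaging all rewards seen for that arm.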

Primary language: Jupyter Notebook. License: MIT.
