Contextual bandit example

This repo contains a simple Python implementation of a contextual bandit. The bandit maintains one classification model per arm to predict the expected reward in an advertisement optimisation problem (i.e. the click-through rate). Exploration happens 10% of the time; you can change this by editing the epsilon parameter in app.py.
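To illustrate the idea, here is a minimal sketch of an epsilon-greedy contextual bandit with one model per arm. It is not the code in app.py: the class names (OnlineLogisticModel, EpsilonGreedyBandit), the simulate helper, and the click probabilities are all hypothetical, and a tiny hand-rolled logistic model stands in for a scikit-learn classifier so the sketch runs with the standard library only.

```python
import math
import random


class OnlineLogisticModel:
    """Tiny logistic regression trained by SGD (stand-in for a scikit-learn classifier)."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        # Predicted probability of a click for context x.
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, reward):
        # One gradient step on the log-loss for a single observation.
        err = reward - self.predict(x)
        self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b += self.lr * err


class EpsilonGreedyBandit:
    """Hypothetical bandit: one model per arm, explore with probability epsilon."""

    def __init__(self, n_arms, n_features, epsilon=0.1, seed=0):
        self.models = [OnlineLogisticModel(n_features) for _ in range(n_arms)]
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select_arm(self, x):
        if self.rng.random() < self.epsilon:
            # Explore: pick an arm uniformly at random.
            return self.rng.randrange(len(self.models))
        # Exploit: pick the arm with the highest predicted reward.
        scores = [m.predict(x) for m in self.models]
        return scores.index(max(scores))

    def update(self, arm, x, reward):
        # Only the model of the arm that was played sees the reward.
        self.models[arm].update(x, reward)


def simulate(n_rounds=5000, seed=0):
    """Toy ad simulation with made-up click probabilities (0.8 vs 0.2)."""
    rng = random.Random(seed)
    bandit = EpsilonGreedyBandit(n_arms=2, n_features=2, epsilon=0.1, seed=seed)
    clicks = 0
    for _ in range(n_rounds):
        # One-hot context: the user belongs to segment 0 or segment 1.
        x = [1.0, 0.0] if rng.random() < 0.5 else [0.0, 1.0]
        arm = bandit.select_arm(x)
        # Assumption: ad i works well for segment i and poorly otherwise.
        p_click = 0.8 if x[arm] == 1.0 else 0.2
        reward = 1 if rng.random() < p_click else 0
        bandit.update(arm, x, reward)
        clicks += reward
    return bandit, clicks / n_rounds


if __name__ == "__main__":
    _, ctr = simulate()
    print("observed click-through rate:", ctr)
```

Setting epsilon=0.0 in this sketch gives a purely greedy policy, and epsilon=1.0 gives uniformly random arm selection.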

For regression problems, where the reward is not binary but a real number, change mode='classification' to mode='regression'.
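As a rough sketch of what such a mode switch could mean for a per-arm model: in classification mode the prediction is squashed into a probability, while in regression mode the raw linear output is used as the predicted reward. The ArmModel class below is hypothetical and only illustrates the distinction; how app.py actually handles the mode parameter may differ.

```python
import math


class ArmModel:
    """Hypothetical per-arm model; `mode` selects how predictions are formed."""

    def __init__(self, n_features, mode="classification", lr=0.05):
        if mode not in ("classification", "regression"):
            raise ValueError("mode must be 'classification' or 'regression'")
        self.mode = mode
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        if self.mode == "classification":
            # Binary reward: return a probability in (0, 1).
            return 1.0 / (1.0 + math.exp(-z))
        # Real-valued reward: return the linear prediction directly.
        return z

    def update(self, x, reward):
        # The SGD step has the same form for log-loss and squared error.
        err = reward - self.predict(x)
        self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b += self.lr * err


if __name__ == "__main__":
    model = ArmModel(n_features=1, mode="regression")
    for _ in range(500):
        model.update([1.0], 3.5)  # e.g. revenue per impression, a made-up value
    print(model.predict([1.0]))
```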

Requirements

The bandit requires scikit-learn, scipy and numpy. To install them all:

 pip install -U scikit-learn scipy numpy

Running the simulation

python3 app.py
