This code applies the upper confidence bound (UCB) method to a synthetic dataset stored in "data.mat". The data is organized as a matrix of timesteps × number of ads. The objective is to minimize the regret, i.e. the gap between the best possible reward and the reward that this algorithm collects. This is one type of deterministic bandit problem with partial feedback: at each timestep only the reward of the chosen ad is observed. Before running the code, download "data.mat" and place it in the same folder as your code.
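For orientation, here is a minimal Python sketch of how a UCB loop over such a timesteps × ads matrix might look. It is not the code from this repository. The function name `run_ucb`, the UCB1-style bonus `sqrt(2 ln t / n)`, and the choice of "best single ad in hindsight" as the regret baseline are all assumptions made for illustration.

```python
import math


def run_ucb(rewards):
    """Run a UCB1-style policy over rewards[t][a] (timesteps x ads).

    Only rewards[t][chosen_ad] is read at each step (partial feedback).
    Returns (chosen_ads, total_reward, regret).
    """
    num_steps = len(rewards)
    num_ads = len(rewards[0])
    counts = [0] * num_ads    # times each ad has been shown
    sums = [0.0] * num_ads    # accumulated reward per ad
    chosen = []
    total = 0.0

    for t in range(num_steps):
        if t < num_ads:
            # Show every ad once so each count is nonzero.
            ad = t
        else:
            # Assumed index: empirical mean plus sqrt(2 ln t / n) bonus.
            ucb = [
                sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t + 1) / counts[a])
                for a in range(num_ads)
            ]
            ad = max(range(num_ads), key=lambda a: ucb[a])

        reward = rewards[t][ad]
        counts[ad] += 1
        sums[ad] += reward
        total += reward
        chosen.append(ad)

    # Assumed baseline: the best single ad in hindsight.
    best_fixed = max(sum(row[a] for row in rewards) for a in range(num_ads))
    return chosen, total, best_fixed - total
```

To try it on the real data, the matrix could be loaded with `scipy.io.loadmat("data.mat")` and converted to a list of rows. The name of the variable stored inside the file is not stated here, so inspect the returned dictionary's keys first.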