This repository implements two strategies for the multi-armed bandit problem: epsilon-greedy and softmax. Both try to maximize the total reward when each arm's reward is drawn from a Gaussian distribution with its own mean.
git clone https://github.com/Ali92hm/multi-armed-bandit.git
The library code is in the algorithm folder. To see how the algorithms are used, look at the demo.py script.
python demo.py
algorithm
├── LICENSE
├── demo.py - Demo of the algorithm in use
└── algorithm - Algorithm implementation
    ├── base_algorithm.py - Base class for the algorithms
    ├── epsilon_greedy.py - Epsilon-greedy algorithm
    └── softmax.py - Softmax algorithm
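
For readers new to the two strategies listed above, here is a minimal, self-contained sketch of how epsilon-greedy and softmax arm selection can work on a Gaussian bandit. It does not use the classes in this repository: the function names (`epsilon_greedy_select`, `softmax_select`, `run`), the unit variance of the rewards, and the sample-average value estimates are all assumptions made for illustration.

```python
import math
import random


def epsilon_greedy_select(estimates, epsilon, rng):
    """With probability epsilon explore a random arm, otherwise exploit the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    return max(range(len(estimates)), key=lambda i: estimates[i])


def softmax_select(estimates, temperature, rng):
    """Sample an arm with probability proportional to exp(estimate / temperature).

    temperature must be > 0; lower values make the choice greedier.
    """
    peak = max(estimates)  # subtract the max so exp() cannot overflow
    weights = [math.exp((q - peak) / temperature) for q in estimates]
    return rng.choices(range(len(estimates)), weights=weights, k=1)[0]


def run(select, param, means, steps=2000, seed=0):
    """Play a Gaussian bandit and return (total reward, estimates, pull counts).

    Rewards are drawn with standard deviation 1.0 (an assumption of this sketch).
    """
    rng = random.Random(seed)
    counts = [0] * len(means)
    estimates = [0.0] * len(means)
    total = 0.0
    for _ in range(steps):
        arm = select(estimates, param, rng)
        reward = rng.gauss(means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the sample mean for the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total, estimates, counts


if __name__ == "__main__":
    means = [0.1, 0.5, 0.9]  # hypothetical true means of three arms
    strategies = [
        ("epsilon-greedy", epsilon_greedy_select, 0.1),
        ("softmax", softmax_select, 0.2),
    ]
    for name, select, param in strategies:
        total, estimates, counts = run(select, param, means)
        print(name, round(total, 1), counts)
```

Both selectors take the current value estimates plus one tuning parameter (epsilon or temperature), so the same `run` loop can drive either strategy. Check demo.py for how the repository's own classes are meant to be called.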