bandits

There are 41 repositories under the bandits topic.

tensorflow/agents
TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
Language: Python · 2.9k 78 674725
yfletberliac/rlss-2019
Materials for the Practical Sessions of the Reinforcement Learning Summer School 2019: Bandits, RL & Deep RL (PyTorch).
Language: Jupyter Notebook · 89 10 044
banditml/banditml
A lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.
Language: Python · 66 5 310
iheartradio/thomas
Another A/B test library
Language: Scala · 24 17 38
thoughtworks/simplebandit
Lightweight contextual bandit library for TS/JS
Language: TypeScript · 18 12 20
YRussac/WeightedLinearBandits
Code associated with the NeurIPS19 paper "Weighted Linear Bandits in Non-Stationary Environments"
Language: Jupyter Notebook · 17 4 12
annieyan/Bandits-using-UCB-algorithm
Thompson Sampling for Bandits using UCB policy
Language: Python · 10 2 03
babaniyi/Deep-contextual-bandits
A benchmark to test decision-making algorithms for contextual bandits. The library implements a variety of algorithms (many of them based on approximate Bayesian neural networks and Thompson sampling) and a number of real and synthetic data problems exhibiting a diverse set of properties.
Language: Python · 9 1 11
DURUII/Replica-AUCB
🐯REPLICA of "Auction-based combinatorial multi-armed bandit mechanisms with strategic arms"
Language: Python · 9 1 00
doerlbh/BanditZoo
Python library of bandits and RL agents in different real-world environments
Language: Python · 7 1 24
jayeshk7/RL-Algorithms
Python implementation of common RL algorithms using OpenAI gym environments
Language: Python · 7 1 00
doerlbh/dilemmaRL
Code for our PRICAI 2022 paper: "Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior".
Language: Python · 6 2 00
kfoofw/applied_learning_articles
Collaborative project for documenting ML/DS learnings.
Language: Jupyter Notebook · 6 3 13
doerlbh/ABaCoDE
Code for our ICDMW 2018 paper: "Contextual Bandit with Adaptive Feature Extraction".
Language: MATLAB · 5 2 01
Nicolivain/RLD
Deep Reinforcement Learning Agents in PyTorch in a modular framework
Language: Jupyter Notebook · 5 1 01
anishacharya/Bandits-Online-Learning
Simple Implementations of Bandit Algorithms in Python
Language: Jupyter Notebook · 4 3 00
doerlbh/BerlinUCB
Code for our AJCAI 2020 paper: "Online Semi-Supervised Learning in Contextual Bandits with Episodic Reward".
Language: MATLAB · 4 1 00
manome/python-mab
This project provides a simulation of multi-armed bandit problems. The implementation is based on the following paper: https://arxiv.org/abs/2308.14350
Language: Python · 4 1 00
TanguyUrvoy/pmlib
A Python library for (finite) Partial Monitoring algorithms
Language: Jupyter Notebook · 4 2 11
alxthm/rld-project
Play Rock, Paper, Scissors (Kaggle competition) with Reinforcement Learning: bandits, tabular Q-learning and PPO with LSTM.
Language: Python · 3 1 12
Ralyhu/CMAB-CC
Code and data for the paper "A Combinatorial Multi-Armed Bandit Approach to Correlation Clustering", DAMI 2023
Language: Python · 3 1 01
foreverska/buffalo-gym
Multi-armed Bandit Gymnasium Environment
Language: Python · 2 1 00
gurbaaz27/amazon-hackathon
Language: Jupyter Notebook · 2 2 02
lasgroup/MaxMinLCB
Code for our paper "Bandits with Preference Feedback: A Stackelberg Game Perspective"
Language: Python · 2 3 01
Nicolivain/trustful-bandits
A two-armed bandit simulation and comparison with theoretical convergence
Language: Jupyter Notebook · 2 1 00
sarthakmittal92/multi-armed-bandits
Repository for the course project for CS-747 (Foundations of Intelligent & Learning Agents) at IIT Bombay in Autumn 2022.
Language: Python · 2 1 00
ElianBelot/bernoulli-bandits
An exploration of multi-armed Bernoulli bandits in reinforcement learning, complete with experiments and observations.
Language: Jupyter Notebook · 1 1 00
krishnaw14/CS747-assignments
Foundations of Intelligent and Learning Agents
Language: Python · 1 1 0
MehranTaghian/prophet-inequlity-implementation
Implementation of the prophet inequalities
Language: Python · 1 2 00
Zaidtech/OverTheWire
This repo contains all the stuff I encountered while playing OverTheWire games.
1 1 00
AlxBouras/NeuralRandUCB
Project for the RL course @ Université Laval
Language: Python · 0 2 00
philinemey/BSE-T3-RL
Coursework, Stochastic Models and Optimization, BSE, Term 3, Class of 2022
Language: Jupyter Notebook · 0 1 00
riccardodv/COOP-learning
Study the interplay between communication and feedback in a cooperative online learning setting.
Language: Python · 0 1 00
rohilrg/Online-Learning-Bandits-Reinforcement-Learning
An assignment for the implementation of Online Learning, Bandits and Reinforcement Learning
Language: Jupyter Notebook · 0 2 01
XiaoMutt/ucbc
Stanford CS234 Course Side Project
Language: Python · 0 2 00
JoelJa835/MAB_Algorithms
Implementation of Multi-Armed Bandit (MAB) algorithms UCB and Epsilon-Greedy. MAB is a class of problems in reinforcement learning where an agent learns to choose actions from a set of arms, each associated with an unknown reward distribution. UCB and Epsilon-Greedy are popular algorithms for solving MAB problems.
Language: Python · 2 0
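
For readers new to the topic, the multi-armed bandit setting described in the last entry can be illustrated with a short sketch. The code below is not taken from any repository listed here. The `BernoulliBandit` class, the function names, the arm probabilities, and the seeds are all hypothetical choices made for illustration. It shows textbook Epsilon-Greedy and UCB1 on arms with hidden Bernoulli reward probabilities.

```python
import math
import random


class BernoulliBandit:
    """Hypothetical test environment: arm i pays 1 with hidden probability probs[i]."""

    def __init__(self, probs, seed=0):
        self.probs = list(probs)
        self.rng = random.Random(seed)

    def pull(self, arm):
        return 1.0 if self.rng.random() < self.probs[arm] else 0.0


def epsilon_greedy(bandit, n_rounds, epsilon=0.1, seed=1):
    """Explore a uniformly random arm with probability epsilon, else exploit."""
    rng = random.Random(seed)
    k = len(bandit.probs)
    counts = [0] * k      # number of pulls per arm
    means = [0.0] * k     # running average reward per arm
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(k)
        else:
            arm = max(range(k), key=lambda a: means[a])
        reward = bandit.pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means


def ucb1(bandit, n_rounds):
    """Pull the arm with the highest mean plus confidence bonus (UCB1)."""
    k = len(bandit.probs)
    counts = [0] * k
    means = [0.0] * k
    for t in range(1, n_rounds + 1):
        if t <= k:
            arm = t - 1  # initialisation: play every arm once
        else:
            arm = max(
                range(k),
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = bandit.pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


if __name__ == "__main__":
    probs = [0.2, 0.5, 0.8]  # assumed reward probabilities, unknown to the agent
    print("epsilon-greedy pulls:", epsilon_greedy(BernoulliBandit(probs), 5000)[0])
    print("UCB1 pulls:          ", ucb1(BernoulliBandit(probs), 5000)[0])
```

Both functions return the per-arm pull counts and estimated mean rewards, so the two strategies can be compared by how often each one selects the best arm.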