bandits
There are 41 repositories under the bandits topic.
tensorflow/agents
TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
yfletberliac/rlss-2019
Materials for the Practical Sessions of the Reinforcement Learning Summer School 2019: Bandits, RL & Deep RL (PyTorch).
banditml/banditml
A lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.
iheartradio/thomas
Another A/B test library
thoughtworks/simplebandit
Lightweight contextual bandit library for TS/JS
YRussac/WeightedLinearBandits
Code associated with the NeurIPS19 paper "Weighted Linear Bandits in Non-Stationary Environments"
annieyan/Bandits-using-UCB-algorithm
Thompson Sampling for Bandits using UCB policy
babaniyi/Deep-contextual-bandits
A benchmark to test decision-making algorithms for contextual bandits. The library implements a variety of algorithms (many of them based on approximate Bayesian Neural Networks and Thompson sampling), and a number of real and synthetic data problems exhibiting a diverse set of properties.
DURUII/Replica-AUCB
🐯REPLICA of "Auction-based combinatorial multi-armed bandit mechanisms with strategic arms"
doerlbh/BanditZoo
Python library of bandits and RL agents in different real-world environments
jayeshk7/RL-Algorithms
Python implementation of common RL algorithms using OpenAI gym environments
doerlbh/dilemmaRL
Code for our PRICAI 2022 paper: "Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior".
kfoofw/applied_learning_articles
Collaborative project for documenting ML/DS learnings.
doerlbh/ABaCoDE
Code for our ICDMW 2018 paper: "Contextual Bandit with Adaptive Feature Extraction".
Nicolivain/RLD
Deep Reinforcement Learning Agents in PyTorch in a modular framework
anishacharya/Bandits-Online-Learning
Simple Implementations of Bandit Algorithms in Python
doerlbh/BerlinUCB
Code for our AJCAI 2020 paper: "Online Semi-Supervised Learning in Contextual Bandits with Episodic Reward".
manome/python-mab
This project provides a simulation of multi-armed bandit problems. The implementation is based on the following paper: https://arxiv.org/abs/2308.14350
TanguyUrvoy/pmlib
A Python library for (finite) Partial Monitoring algorithms
alxthm/rld-project
Play Rock, Paper, Scissors (Kaggle competition) with Reinforcement Learning: bandits, tabular Q-learning and PPO with LSTM.
Ralyhu/CMAB-CC
Code and data for the paper "A Combinatorial Multi-Armed Bandit Approach to Correlation Clustering", DAMI 2023
foreverska/buffalo-gym
Multi-armed Bandit Gymnasium Environment
lasgroup/MaxMinLCB
Code for our paper "Bandits with Preference Feedback: A Stackelberg Game Perspective"
Nicolivain/trustful-bandits
A two-armed bandit simulation and comparison with theoretical convergence
sarthakmittal92/multi-armed-bandits
Repository for the course project done as part of the CS-747 (Foundations of Intelligent & Learning Agents) course at IIT Bombay in Autumn 2022.
ElianBelot/bernoulli-bandits
An exploration of multi-armed Bernoulli bandits in reinforcement learning, complete with experiments and observations.
krishnaw14/CS747-assignments
Foundations of Intelligent and Learning Agents
MehranTaghian/prophet-inequlity-implementation
Implementation of the prophet inequalities
Zaidtech/OverTheWire
This repo contains all the stuff I encountered while playing OverTheWire games.
AlxBouras/NeuralRandUCB
Project for the RL course @ Université Laval
philinemey/BSE-T3-RL
Coursework, Stochastic Models and Optimization, BSE, Term 3, Class of 2022
riccardodv/COOP-learning
Study the interplay between communication and feedback in a cooperative online learning setting.
rohilrg/Online-Learning-Bandits-Reinforcement-Learning
An assignment for the implementation of Online Learning, Bandits and Reinforcement Learning
XiaoMutt/ucbc
Stanford CS234 Course Side Project
JoelJa835/MAB_Algorithms
Implementation of Multi-Armed Bandit (MAB) algorithms UCB and Epsilon-Greedy. MAB is a class of problems in reinforcement learning where an agent learns to choose actions from a set of arms, each associated with an unknown reward distribution. UCB and Epsilon-Greedy are popular algorithms for solving MAB problems.
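The two algorithms named in the last entry can be illustrated with a minimal sketch. This is not code from JoelJa835/MAB_Algorithms or any other repository listed here; it assumes Bernoulli rewards and the standard UCB1 exploration bonus sqrt(2 ln t / n). The function names `epsilon_greedy`, `ucb1`, and `run`, and the arm probabilities in the usage note, are hypothetical and chosen only for illustration.

```python
import math
import random


def epsilon_greedy(counts, values, epsilon, rng):
    """Pick a random arm with probability epsilon, else the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    return max(range(len(values)), key=lambda a: values[a])


def ucb1(counts, values, t):
    """Pick the arm maximizing the mean estimate plus the UCB1 bonus."""
    # Play every arm once first; the bonus is undefined for unpulled arms.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: values[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def run(policy, probs, steps=5000, epsilon=0.1, seed=0):
    """Simulate a Bernoulli bandit (assumed reward model).

    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    counts = [0] * len(probs)    # number of pulls per arm
    values = [0.0] * len(probs)  # running mean reward per arm
    total = 0.0
    for t in range(1, steps + 1):
        if policy == "ucb1":
            arm = ucb1(counts, values, t)
        else:
            arm = epsilon_greedy(counts, values, epsilon, rng)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean for the pulled arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return counts, total
```

Calling `run("ucb1", [0.2, 0.5, 0.8])` or `run("epsilon_greedy", [0.2, 0.5, 0.8])` returns the per-arm pull counts and the total reward, which makes it easy to compare how quickly each policy concentrates on the arm with the highest success probability.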