bandits

There are 41 repositories under the bandits topic.

  • tensorflow/agents

    TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

    Language:Python2.9k78674725
  • yfletberliac/rlss-2019

    Materials for the Practical Sessions of the Reinforcement Learning Summer School 2019: Bandits, RL & Deep RL (PyTorch).

    Language:Jupyter Notebook8910044
  • banditml/banditml

    A lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.

    Language:Python665310
  • iheartradio/thomas

    Another A/B test library

    Language:Scala241738
  • thoughtworks/simplebandit

    lightweight contextual bandit library for ts/js

    Language:TypeScript181220
  • YRussac/WeightedLinearBandits

    Code associated with the NeurIPS19 paper "Weighted Linear Bandits in Non-Stationary Environments"

    Language:Jupyter Notebook17412
  • annieyan/Bandits-using-UCB-algorithm

    Thompson Sampling for Bandits using UCB policy

    Language:Python10203
  • babaniyi/Deep-contextual-bandits

    A benchmark to test decision-making algorithms for contextual bandits. The library implements a variety of algorithms (many of them based on approximate Bayesian neural networks and Thompson sampling), and a number of real and synthetic data problems exhibiting a diverse set of properties.

    Language:Python9111
  • DURUII/Replica-AUCB

    🐯REPLICA of "Auction-based combinatorial multi-armed bandit mechanisms with strategic arms"

    Language:Python9100
  • doerlbh/BanditZoo

    Python library of bandits and RL agents in different real-world environments

    Language:Python7124
  • jayeshk7/RL-Algorithms

    Python implementation of common RL algorithms using OpenAI gym environments

    Language:Python7100
  • doerlbh/dilemmaRL

    Code for our PRICAI 2022 paper: "Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior".

    Language:Python6200
  • kfoofw/applied_learning_articles

    Collaborative project for documenting ML/DS learnings.

    Language:Jupyter Notebook6313
  • doerlbh/ABaCoDE

    Code for our ICDMW 2018 paper: "Contextual Bandit with Adaptive Feature Extraction".

    Language:MATLAB5201
  • Nicolivain/RLD

    Deep Reinforcement Learning agents in PyTorch, in a modular framework

    Language:Jupyter Notebook5101
  • anishacharya/Bandits-Online-Learning

    Simple implementations of bandit algorithms in Python

    Language:Jupyter Notebook4300
  • doerlbh/BerlinUCB

    Code for our AJCAI 2020 paper: "Online Semi-Supervised Learning in Contextual Bandits with Episodic Reward".

    Language:MATLAB4100
  • manome/python-mab

    This project provides a simulation of multi-armed bandit problems. The implementation is based on the paper at https://arxiv.org/abs/2308.14350.

    Language:Python4100
  • TanguyUrvoy/pmlib

    A Python library for (finite) Partial Monitoring algorithms

    Language:Jupyter Notebook4211
  • alxthm/rld-project

    Play Rock, Paper, Scissors (Kaggle competition) with Reinforcement Learning: bandits, tabular Q-learning and PPO with LSTM.

    Language:Python3112
  • Ralyhu/CMAB-CC

    Code and data for the paper "A Combinatorial Multi-Armed Bandit Approach to Correlation Clustering", DAMI 2023

    Language:Python3101
  • foreverska/buffalo-gym

    Multi-armed Bandit Gymnasium Environment

    Language:Python2100
  • gurbaaz27/amazon-hackathon

    Language:Jupyter Notebook2202
  • lasgroup/MaxMinLCB

    Code for our paper "Bandits with Preference Feedback: A Stackelberg Game Perspective"

    Language:Python2301
  • Nicolivain/trustful-bandits

    A two-armed bandit simulation and comparison with theoretical convergence

    Language:Jupyter Notebook2100
  • sarthakmittal92/multi-armed-bandits

    Repository for the course project done as part of CS-747 (Foundations of Intelligent & Learning Agents) course at IIT Bombay in Autumn 2022.

    Language:Python2100
  • ElianBelot/bernoulli-bandits

    An exploration of multi-armed Bernoulli bandits in reinforcement learning, complete with experiments and observations.

    Language:Jupyter Notebook1100
  • krishnaw14/CS747-assignments

    Foundations of Intelligent and Learning Agents

    Language:Python110
  • MehranTaghian/prophet-inequlity-implementation

    Implementation of the prophet inequalities

    Language:Python1200
  • Zaidtech/OverTheWire

    This repo contains all the stuff I encountered while playing OverTheWire games.

  • AlxBouras/NeuralRandUCB

    Project for the RL course @ Université Laval

    Language:Python0200
  • philinemey/BSE-T3-RL

    Coursework, Stochastic Models and Optimization, BSE, Term 3, Class of 2022

    Language:Jupyter Notebook0100
  • riccardodv/COOP-learning

    Study the interplay between communication and feedback in a cooperative online learning setting.

    Language:Python0100
  • rohilrg/Online-Learning-Bandits-Reinforcement-Learning

    An assignment for the implementation of Online Learning, Bandits and Reinforcement Learning

    Language:Jupyter Notebook0201
  • XiaoMutt/ucbc

    Stanford CS234 Course Side Project

    Language:Python0200
  • JoelJa835/MAB_Algorithms

    Implementation of Multi-Armed Bandit (MAB) algorithms UCB and Epsilon-Greedy. MAB is a class of problems in reinforcement learning where an agent learns to choose actions from a set of arms, each associated with an unknown reward distribution. UCB and Epsilon-Greedy are popular algorithms for solving MAB problems.

    Language:Python20
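
Several of the repositories listed above implement UCB and Epsilon-Greedy for the multi-armed bandit problem. As a rough illustration of what those two policies do, here is a minimal sketch in Python. It is not taken from any of the listed repositories. It assumes Bernoulli rewards, uses the standard UCB1 index (empirical mean plus sqrt(2 ln t / n)), and the names `epsilon_greedy`, `ucb1`, and `run` are hypothetical helpers chosen for this example.

```python
import math
import random


def epsilon_greedy(counts, values, epsilon, rng):
    """With probability epsilon explore a random arm, else exploit the best mean."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    return max(range(len(counts)), key=lambda a: values[a])


def ucb1(counts, values, t):
    """Return the arm with the highest UCB1 index; unplayed arms are tried first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(counts)),
        key=lambda a: values[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def run(policy, probs, horizon, seed=0, epsilon=0.1):
    """Simulate a Bernoulli bandit (assumed reward model) for `horizon` rounds."""
    rng = random.Random(seed)
    counts = [0] * len(probs)    # number of pulls per arm
    values = [0.0] * len(probs)  # running mean reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if policy == "ucb1":
            arm = ucb1(counts, values, t)
        else:
            arm = epsilon_greedy(counts, values, epsilon, rng)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the empirical mean for the pulled arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return counts, total


if __name__ == "__main__":
    for name in ("ucb1", "epsilon_greedy"):
        pulls, reward = run(name, probs=[0.2, 0.5, 0.8], horizon=2000)
        print(name, pulls, reward)
```

Both policies share the same bookkeeping (pull counts and running means) and differ only in how they pick the next arm: Epsilon-Greedy explores uniformly at a fixed rate, while UCB1 adds an exploration bonus that shrinks as an arm is pulled more often.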