contextual-bandits
There are 56 repositories under the contextual-bandits topic.
VowpalWabbit/vowpal_wabbit
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
tensorflow/agents
TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
david-cortes/contextualbandits
Python implementations of contextual bandit algorithms
st-tech/zr-obp
Open Bandit Pipeline: a Python library for bandit algorithms and off-policy evaluation
fidelity/mabwiser
[IJAIT 2021] MABWiser: Contextual Multi-Armed Bandits Library
alison-carrera/onn
Online Deep Learning: Learning Deep Neural Networks on the Fly / Non-linear Contextual Bandit Algorithm (ONN_THS)
alison-carrera/mabalgs
👤 Multi-Armed Bandit Algorithms Library (MAB) 👮
Nth-iteration-labs/contextual
Contextual Bandits in R - simulation and evaluation of Multi-Armed Bandit Policies
banditml/banditml
A lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.
instadeepai/catx
🐈‍⬛ Contextual bandits library for continuous action trees with smoothing in JAX
lil-lab/blocks
Blocks World -- Simulator, Code, and Models (Misra et al. EMNLP 2017)
pemami4911/sinkhorn-policy-gradient.pytorch
Code accompanying the paper "Learning Permutations with Sinkhorn Policy Gradient"
Heewon-Hailey/multi-armed-bandits-for-recommendation-systems
Implementations of basic and contextual MAB algorithms for recommendation systems
thunfischtoast/LinUCB
Contextual bandit algorithm called LinUCB (Linear Upper Confidence Bound), as proposed by Li, Chu, Langford, and Schapire
doerlbh/MiniVox
Code for our ACML and INTERSPEECH papers: "Speaker Diarization as a Fully Online Bandit Learning Problem in MiniVox".
mmalekzadeh/privacy-preserving-bandits
Privacy-Preserving Bandits (MLSys'20)
improve-ai/python-ranker
Contextual Multi-Armed Bandit Platform for Scoring, Ranking & Decisions
RonyAbecidan/Neural-Thompson-Sampling
Study of the paper 'Neural Thompson Sampling' published in October 2020
jtcho/FairMachineLearning
Implementation of provably Rawlsian fair ML algorithms for contextual bandits.
thoughtworks/simplebandit
Lightweight contextual bandit library for TypeScript/JavaScript
improve-ai/swift-ranker
Easily Score & Rank Codable Objects with ML
sparsh-ai/reco-bandit
Building recommender systems using contextual bandit methods to address the cold-start issue and online real-time learning
travisbrady/ocaml-vw
OCaml bindings to Vowpal Wabbit
marlesson/meta-bandit-selector
The Contextual Meta-Bandit (CMB) selects among models based on context, using online learning framed as a reinforcement learning problem. It can be used for recommender system ensembles, A/B tests, and other dynamic model selection problems.
hsm207/cb-trading
Code to trade the financial markets using Contextual Bandits
Nth-iteration-labs/streamingbandit-ui
Client that handles the administration of StreamingBandit online or straight from your desktop. Set up and run streaming (contextual) bandit experiments in your browser.
improve-ai/tracker-trainer
Contextual Multi-Armed Bandit Reward Tracker & Model Trainer
zaid-g/ccb_tutorial
Contextual multi-armed bandit recommender system using Vowpal Wabbit
aaronkurz/hitl-ab-bpm
Business Process Improvement with Reinforcement Learning and Human-in-the-Loop.
doerlbh/dilemmaRL
Code for our PRICAI 2022 paper: "Online Learning in Iterated Prisoner's Dilemma to Mimic Human Behavior".
aldente0630/multi_armed_bandit
Experiment results using MAB algorithms on the Yahoo! Front Page Today Module User Click Log dataset
jackgerrits/reductionml
Reduction-based machine learning framework with a focus on contextual bandits
Murtazali05/LinUCB
LinUCB with disjoint linear models
ngutowski/algossim
This repository helps you learn the most popular MAB and CMAB algorithms and watch how they run. It is aimed at those starting to learn these topics.
saeedghoorchian/NCC-Bandits
Experiments for the paper "Online Learning with Costly Features in Non-stationary Environments"
TheAmazingElys/NeuralBandit
Code for the NeuralBandit paper