bakanaouji's Stars
openai/spinningup
An educational resource to help anyone learn deep reinforcement learning.
paperswithcode/ai-deadlines
:alarm_clock: AI conference deadline countdowns
clibs/clib
Package manager for the C programming language.
google-deepmind/open_spiel
OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.
datamllab/rlcard
Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
openai/multi-agent-emergence-environments
Environment generation code for the paper "Emergent Tool Use From Multi-Agent Autocurricula"
pfnet/pfrl
PFRL: a PyTorch-based deep reinforcement learning library
eleurent/phd-bibliography
References on Optimal Control, Reinforcement Learning and Motion Planning
david-cortes/contextualbandits
Python implementations of contextual bandits algorithms
marcharper/python-ternary
:small_red_triangle: Ternary plotting library for Python with matplotlib
jannerm/trajectory-transformer
Code for the paper "Offline Reinforcement Learning as One Big Sequence Modeling Problem"
Kaggle/kaggle-environments
antonismand/Personalized-News-Recommendation
Multi-armed bandits implementation using the Yahoo! Front Page Today Module User Click Log Dataset
CyberAgentAILab/minituna
A toy hyperparameter optimization framework intended for understanding Optuna's internal design.
bakanaouji/cpp-cfr
C++ implementations of Counterfactual Regret Minimization and Monte Carlo CFR
laonahongchen/Bilevel-Optimization-in-Coordination-Game
Code implementation for "Bi-level Actor-Critic for Multi-agent Coordination" (AAAI 2020)
shiqiangw/iclr2024-scores
criteo-research/optimization-continuous-action-crm
CausalML/DoubleReinforcementLearningMDP
gisoo1989/Doubly-Robust-Lasso-Bandit
CyberAgentAILab/thresholded-lasso-bandit
c-bata/sandbox-atcoder
CyberAgentAILab/mcts-capacity-expansion
CyberAgentAILab/mutant-ftrl
CyberAgentAILab/adaptively-perturbed-md
CyberAgentAILab/m2wu
denizalp/min-max-fisher
jannerm/d4rl
A benchmark for offline reinforcement learning.
mohitkarnani/matching-code
Implementation of the Gale-Shapley deferred acceptance algorithm in MATLAB.
KentaroToyoshima/fisher-gda