bakanaouji

https://twitter.com/bakanaouji

CyberAgent, Inc. Tokyo, Japan

bakanaouji's Stars

openai/spinningup
An educational resource to help anyone learn deep reinforcement learning.
Language: Python 10.1k 227 2812.2k
paperswithcode/ai-deadlines
:alarm_clock: AI conference deadline countdowns
Language: JavaScript 5.6k 100 94961
clibs/clib
Package manager for the C programming language.
Language: C 4.9k 130 159243
google-deepmind/open_spiel
OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games.
Language: C++ 4.2k 108 555932
datamllab/rlcard
Reinforcement Learning / AI Bots in Card (Poker) Games - Blackjack, Leduc, Texas, DouDizhu, Mahjong, UNO.
Language: Python 2.9k 74 198626
openai/multi-agent-emergence-environments
Environment generation code for the paper "Emergent Tool Use From Multi-Agent Autocurricula"
Language: Python 1.6k 187 31306
pfnet/pfrl
PFRL: a PyTorch-based deep reinforcement learning library
Language: Python 1.2k 91 75157
eleurent/phd-bibliography
References on Optimal Control, Reinforcement Learning and Motion Planning
921 38 2205
david-cortes/contextualbandits
Python implementations of contextual bandits algorithms
Language: Python 744 23 58146
marcharper/python-ternary
:small_red_triangle: Ternary plotting library for python with matplotlib
Language: Python 727 17 139156
jannerm/trajectory-transformer
Code for the paper "Offline Reinforcement Learning as One Big Sequence Modeling Problem"
Language: Python 459 6 2065
Kaggle/kaggle-environments
Language: Jupyter Notebook 289 38 70147
antonismand/Personalized-News-Recommendation
Multi Armed Bandits implementation using the Yahoo! Front Page Today Module User Click Log Dataset
Language: Jupyter Notebook 93 2 024
CyberAgentAILab/minituna
A toy hyperparameter optimization framework intended for understanding Optuna's internal design.
Language: Python 83 4 09
bakanaouji/cpp-cfr
C++ implementations of Counterfactual Regret Minimization and Monte Carlo CFR
Language: C++ 71 4 210
laonahongchen/Bilevel-Optimization-in-Coordination-Game
Code implementation for 'Bi-level Actor-Critic for Multi-agent Coordination' (AAAI 2020)
Language: Python 54 3 124
shiqiangw/iclr2024-scores
Language: Python 52 2 32
criteo-research/optimization-continuous-action-crm
Language: Python 30 7 02
CausalML/DoubleReinforcementLearningMDP
Language: Python 10 2 04
gisoo1989/Doubly-Robust-Lasso-Bandit
Language: Python 7
CyberAgentAILab/thresholded-lasso-bandit
Language: Python 5 1 00
c-bata/sandbox-atcoder
Language: C++ 3 3 2
CyberAgentAILab/mcts-capacity-expansion
Language: Python 3 1 01
CyberAgentAILab/mutant-ftrl
Language: Python 3 1 00
CyberAgentAILab/adaptively-perturbed-md
Language: Python 2 1 01
CyberAgentAILab/m2wu
Language: Python 2 2 00
denizalp/min-max-fisher
Language: Python 20
jannerm/d4rl
A benchmark for offline reinforcement learning.
Language: Python 2 1 03
mohitkarnani/matching-code
Implementation of Gale-Shapley deferred acceptance algorithm in MATLAB.
Language: MATLAB 21
KentaroToyoshima/fisher-gda
Language: Python 1
