nissymori
D1 student. Interested in Offline RL, Game AI, and JAX-based RL.
The University of Tokyo
Tokyo, Japan
nissymori's Stars
srush/GPU-Puzzles
Solve puzzles. Learn CUDA.
samlobel/CFN
Accompanying Code for "Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning", ICML 2023
kenjyoung/MinAtar
nissymori/JAX-CORL
Clean single-file implementation of offline RL algorithms in JAX
DorsaRoh/Machine-Learning
Machine learning from scratch
keraJLi/synthetic-gymnax
k4ntz/HackAtari
sotetsuk/brl
reinforcement learning for bridge
mttga/purejaxql
Simple single-file baselines for Q-Learning in pure-GPU setting
kvfrans/jax-diffusion-transformer
Implementation of Diffusion Transformer (DiT) in JAX
DHDev0/Stochastic-muzero
PyTorch implementation of Stochastic MuZero for Gym environments. The algorithm supports a wide range of action and observation spaces, both discrete and continuous.
ZhengyaoJiang/latentplan
Code release for Efficient Planning in a Compact Latent Action Space (ICLR2023) https://arxiv.org/abs/2208.10291.
instadeepai/og-marl
Datasets with baselines for offline multi-agent reinforcement learning.
awesome-mlss/awesome-mlss
🤖 Machine Learning Summer School deadlines
CheeksTheGeek/PyJSONCanvas
A simple library for working with JSON Canvas (previously known as Obsidian Canvas) files.
keraJLi/rejax
pickxiguapi/Uni-RLHF-Platform
Uni-RLHF platform for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR2024)
pickxiguapi/Clean-Offline-RLHF
Offline RLHF codebase implementation for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR2024)
JohannesAck/OfflineRLStructuredNonstationarity
Implementation for RLC paper "Offline Reinforcement Learning from Datasets with Structured Non-Stationarity".
Kaixhin/imitation-learning
Imitation learning algorithms
Improbable-AI/harness-offline-rl
Official implementation of Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Reweighting
williamd4112/suboptimal_offline_datasets
Improbable-AI/dw-offline-rl
Official implementation of NeurIPS'23 paper, Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets
kristery/Elastic-DT
[NeurIPS 2023] Implementation of Elastic Decision Transformer
Dragon-Zhuang/Reinformer
Official code for ICML 2024 paper Reinformer: Max-Return Sequence Modeling for offline RL
google-deepmind/searchless_chess
Grandmaster-Level Chess Without Search
Baichenjia/UTDS
Pessimistic Value Iteration for Multi-Task Data Sharing in Offline RL
araffin/sbx
SBX: Stable Baselines Jax (SB3 + Jax)
BirkhoffG/jax-dataloader
PyTorch-like dataloaders in JAX.
google/grain