nissymori
First-year doctoral student (D1). Interested in Offline RL, Game AI, and JAX-based RL.
The University of Tokyo
Tokyo, Japan
Pinned Repositories
mjx
Mjx: A framework for Mahjong AI research
a2c-minatar
CDA
Code for our EMNLP 2020 paper: Multilevel Text Alignment with Cross-Document Attention by Xuhui Zhou, Nikolaos Pappas, and Noah A. Smith
D4RL
A collection of reference environments for offline reinforcement learning
direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
JAX-CORL
Clean single-file implementation of offline RL algorithms in JAX
mjai
Game server for Japanese Mahjong AI.
nissymori.github.io
reinforce
A simple REINFORCE algorithm implementation in PyTorch
pgx
♟️ Vectorized RL game environments in JAX
nissymori's Repositories
nissymori/JAX-CORL
Clean single-file implementation of offline RL algorithms in JAX
nissymori/nissymori.github.io
nissymori/a2c-minatar
nissymori/CDA
Code for our EMNLP 2020 paper: Multilevel Text Alignment with Cross-Document Attention by Xuhui Zhou, Nikolaos Pappas, and Noah A. Smith
nissymori/D4RL
A collection of reference environments for offline reinforcement learning
nissymori/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
nissymori/mjai
Game server for Japanese Mahjong AI.
nissymori/reinforce
A simple REINFORCE algorithm implementation in PyTorch
nissymori/rejax
nissymori/SRPO
[NeurIPS 2023] The official code for the paper "State Regularized Policy Optimization on Data with Dynamics Shift"
nissymori/td-gammon
TD-Gammon implementation