nissymori

D1 student. Interested in Offline RL, Game AI, and JAX-based RL.

The University of Tokyo
Tokyo, Japan

Pinned Repositories

mjx
Mjx: A framework for Mahjong AI research
Language: C++ 175 6 42020
a2c-minatar
Language: Python 0 0 00
CDA
code for our EMNLP2020 paper: Multilevel Text Alignment with Cross-Document Attention by Xuhui Zhou, Nikolaos Pappas, and Noah A. Smith
Language: Python 0 0 00
D4RL
A collection of reference environments for offline reinforcement learning
Language: Python 0 0 00
direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
Language: Python 00
JAX-CORL
Clean single-file implementation of offline RL algorithms in JAX
Language: Python 134 4 252
mjai
Game server for Japanese Mahjong AI.
Language: Ruby 0 0 00
nissymori.github.io
Language: HTML 1 1 00
reinforce
A simple REINFORCE algorithm implementation in PyTorch
Language: Python 0 0 00
pgx
♟️ Vectorized RL game environments in JAX
Language: Python 444 8 24533

nissymori's Repositories

nissymori/JAX-CORL
Clean single-file implementation of offline RL algorithms in JAX
Language: Python 134 4 252
nissymori/nissymori.github.io
Language: HTML 1 1 00
nissymori/a2c-minatar
Language: Python 0 0 00
nissymori/CDA
code for our EMNLP2020 paper: Multilevel Text Alignment with Cross-Document Attention by Xuhui Zhou, Nikolaos Pappas, and Noah A. Smith
Language: Python 0 0 00
nissymori/D4RL
A collection of reference environments for offline reinforcement learning
Language: Python 0 0 00
nissymori/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
Language: Python 00
nissymori/mjai
Game server for Japanese Mahjong AI.
Language: Ruby 0 0 00
nissymori/reinforce
A simple REINFORCE algorithm implementation in PyTorch
Language: Python 0 0 00
nissymori/rejax
Language: Python 00
nissymori/SRPO
[NeurIPS 2023] The official code for paper "State Regularized Policy Optimization on Data with Dynamics Shift"
Language: Python 00
nissymori/td-gammon
TD-Gammon implementation
Language: Python 0 0 00