Pinned Repositories
ClORL
Authors' implementation of the paper "Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning?"
GSM8K-AI-SubQ
Author's repository for GSM8K-AI-SubQ reasoning dataset
Mithril
MithrilApp
multi-agent-emergence-unity
Reproduction of the environment from the paper "Emergent Tool Use From Multi-Agent Autocurricula" with Unity and ML-Agents
ReBRAC
Author's implementation of ReBRAC, a minimalist improvement upon TD3+BC
TeamStatisticsPlugin
TeamStatisticsPluginServer
Django server
CORL
High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC
DT6A's Repositories
DT6A/ReBRAC
Author's implementation of ReBRAC, a minimalist improvement upon TD3+BC
DT6A/ClORL
Authors' implementation of the paper "Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning?"
DT6A/multi-agent-emergence-unity
Reproduction of the environment from the paper "Emergent Tool Use From Multi-Agent Autocurricula" with Unity and ML-Agents
DT6A/TeamStatisticsPlugin
DT6A/TeamStatisticsPluginServer
Django server
DT6A/GSM8K-AI-SubQ
Author's repository for GSM8K-AI-SubQ reasoning dataset
DT6A/fl_2020_hse_win
Repository for the formal languages course
DT6A/Hands-On_Machine_Learning
Solutions to exercises from the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow"
DT6A/HSE_OS
DT6A/Java-Servers
Comparison of different server architectures
DT6A/JBMAHS
DT6A/ActoReg
DT6A/awesome-offline-rl
An index of algorithms for offline reinforcement learning (offline-rl)
DT6A/CIL
DT6A/CORL
High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC
DT6A/d4rl
A benchmark for offline reinforcement learning.
DT6A/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
DT6A/DSS_HSE
DT6A/DT6A.github.io
DT6A/GrokkingDRL
DT6A/HSE_Python
DT6A/HSE_SD
DT6A/IQL-PyTorch
A PyTorch implementation of Implicit Q-Learning
DT6A/mathematical-statistics
DT6A/nlp_course_project
DT6A/optax
Optax is a gradient processing and optimization library for JAX.
DT6A/plotting
DT6A/Recsys-course-homework_2022
DT6A/SD-2022
Software Design HW
DT6A/socratic-generation
Automatic Generation of Scaffolding Questions for Learning Math, EMNLP 2022