DT6A

ETH Zurich
Zurich, Switzerland

Pinned Repositories

ClORL
Authors' implementation of the "Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning?"
Language: Python 7 1 00
GSM8K-AI-SubQ
Author's repository for GSM8K-AI-SubQ reasoning dataset
Language: Python 20
Mithril
Language: C 2 3 00
MithrilApp
Language: Kotlin 2 1 00
multi-agent-emergence-unity
Reproduction of environment from paper "Emergent Tool Use From Multi-Agent Autocurricula" with Unity and ML-Agents
Language: C# 7 3 40
ReBRAC
Author's implementation of ReBRAC, a minimalist improvement upon TD3+BC
Language: Jupyter Notebook 11 0 00
TeamStatisticsPlugin
Language: Java 3 4 01
TeamStatisticsPluginServer
Django server
Language: Python 3 2 00
CORL
High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC
Language: Python 1.1k 16 28124
ReBRAC
Author's implementation of ReBRAC, a minimalist improvement upon TD3+BC
Language: Jupyter Notebook 50 2 06

DT6A's Repositories

DT6A/ReBRAC
Author's implementation of ReBRAC, a minimalist improvement upon TD3+BC
Language: Jupyter Notebook 11 0 00
DT6A/ClORL
Authors' implementation of the "Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning?"
Language: Python 7 1 00
DT6A/multi-agent-emergence-unity
Reproduction of environment from paper "Emergent Tool Use From Multi-Agent Autocurricula" with Unity and ML-Agents
Language: C# 7 3 40
DT6A/TeamStatisticsPlugin
Language: Java 3 4 01
DT6A/TeamStatisticsPluginServer
Django server
Language: Python 3 2 00
DT6A/GSM8K-AI-SubQ
Author's repository for GSM8K-AI-SubQ reasoning dataset
Language: Python 20
DT6A/fl_2020_hse_win
Repository for the formal languages course
1 1 00
DT6A/Hands-On_Machine_Learning
Solutions to exercises from the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow"
Language: Jupyter Notebook 1 2 0
DT6A/HSE_OS
Language: C++ 1 2 0
DT6A/Java-Servers
Comparison of different server architectures
Language: Java 1 1 0
DT6A/JBMAHS
Language: Jupyter Notebook 1 2 0
DT6A/ActoReg
Language: Python
DT6A/awesome-offline-rl
An index of algorithms for offline reinforcement learning (offline-rl)
0 0
DT6A/CIL
Language: Python
DT6A/CORL
High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC
Language: Python 0 0
DT6A/d4rl
A benchmark for offline reinforcement learning.
Language: Python 0 0
DT6A/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
Language: Python 0 0
DT6A/DSS_HSE
Language: C++ 1 0
DT6A/DT6A.github.io
Language: HTML 1 0
DT6A/GrokkingDRL
Language: Jupyter Notebook 1 0
DT6A/HSE_Python
Language: Python 1 0
DT6A/HSE_SD
1 0
DT6A/IQL-PyTorch
A PyTorch implementation of Implicit Q-Learning
Language: Python 0 0
DT6A/mathematical-statistics
Language: Jupyter Notebook 1 0
DT6A/nlp_course_project
Language: Jupyter Notebook 1 0
DT6A/optax
Optax is a gradient processing and optimization library for JAX.
Language: Python 0 0
DT6A/plotting
DT6A/Recsys-course-homework_2022
Language: Jupyter Notebook 0 0
DT6A/SD-2022
Software Design HW
0 0
DT6A/socratic-generation
Automatic Generation of Scaffolding Questions for Learning Math, EMNLP 2022
Language: Python 0 0