Pinned Repositories
Adam-mini
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
astro_config
astronvim
AstroNvim template (v4+)
astrovim_config
basalt-competition
Second-place submission in the 2021 MineRL BASALT competition: training Minecraft agents on hard-to-specify tasks from demonstrations, using Inverse soft-Q Learning for Imitation.
BCO
Behavioral Cloning from Observation
BCO-Fetch
PyTorch Behavioral Cloning from Observation for MuJoCo Fetch Environments
Deep-Learning-Project-Template
A best-practice template architecture for deep learning projects.
tmux_config
ErlebnisW's Repositories
ErlebnisW/Adam-mini
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
ErlebnisW/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
ErlebnisW/astronvim
AstroNvim template (v4+)
ErlebnisW/astrovim_config
ErlebnisW/CloseAirCombat_baseline
An environment based on JSBSIM aimed at one-to-one close air combat.
ErlebnisW/DRL-code-pytorch
Concise PyTorch implementations of DRL algorithms, including REINFORCE, A2C, DQN, PPO (discrete and continuous), DDPG, TD3, and SAC.
ErlebnisW/CMD
ErlebnisW/ElegantRL
Massively Parallel Deep Reinforcement Learning. 🔥
ErlebnisW/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
ErlebnisW/fxp
Fictitious Cross-Play
ErlebnisW/Grounding_LLMs_with_online_RL
We perform functional grounding of LLMs' knowledge in BabyAI-Text
ErlebnisW/HARL
Official implementation of HARL algorithms based on PyTorch.
ErlebnisW/ISP-reID
ISP-reID
ErlebnisW/light-marl-baseline
Concise PyTorch implementations of MARL algorithms, including MAPPO, MADDPG, MATD3, QMIX, and VDN.
ErlebnisW/LlamaGym-py
Fine-tune LLM agents with online reinforcement learning
ErlebnisW/MARLlib
One repository is all that is necessary for Multi-agent Reinforcement Learning (MARL)
ErlebnisW/Mirror-Descent-in-MARL
ErlebnisW/MixEval
The official evaluation suite and dynamic data release for MixEval.
ErlebnisW/nvchad
Starter config for NvChad
ErlebnisW/OfflineRL-Lib
Benchmarked implementations of Offline RL Algorithms.
ErlebnisW/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
ErlebnisW/poker
Google Research Football MARL Benchmark and Research Toolkit
ErlebnisW/policy-space-diversity-psro
ErlebnisW/pymnash
Python library for finding Nash equilibria of multiplayer games
ErlebnisW/RE-Control
ErlebnisW/RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
ErlebnisW/Safe-RLHF
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
ErlebnisW/smac
SMAC: The StarCraft Multi-Agent Challenge
ErlebnisW/sp-psro
ErlebnisW/trl
Train transformer language models with reinforcement learning.