ErlebnisW

Pinned Repositories

Adam-mini
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
Language:Python
alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Language:Jupyter Notebook
astro_config
Language:Lua
astronvim
AstroNvim template (v4+)
Language:Lua
astrovim_config
Language:Lua
basalt-competition
Second-place submission in the 2021 MineRL BASALT competition: training Minecraft agents on hard-to-specify tasks from demonstrations, using Inverse soft-Q Learning for Imitation.
Language:Python
BCO
Behavioral Cloning from Observation
Language:Jupyter Notebook
BCO-Fetch
PyTorch Behavioral Cloning from Observation for MuJoCo Fetch Environments
Language:Python
Deep-Learning-Project-Template
A best-practice template architecture for deep learning projects.
Language:Python
tmux_config

ErlebnisW's Repositories

ErlebnisW/Adam-mini
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
Language:Python
ErlebnisW/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Language:Jupyter Notebook
ErlebnisW/astronvim
AstroNvim template (v4+)
Language:Lua
ErlebnisW/astrovim_config
Language:Lua
ErlebnisW/CloseAirCombat_baseline
A JSBSim-based environment for one-on-one close air combat.
Language:Python
ErlebnisW/DRL-code-pytorch
Concise PyTorch implementations of DRL algorithms, including REINFORCE, A2C, DQN, PPO (discrete and continuous), DDPG, TD3, and SAC.
Language:Python
ErlebnisW/CMD
Language:Python
ErlebnisW/ElegantRL
Massively Parallel Deep Reinforcement Learning. 🔥
Language:Python
ErlebnisW/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Language:Python
ErlebnisW/fxp
Fictitious Cross-Play
Language:Python
ErlebnisW/Grounding_LLMs_with_online_RL
We perform functional grounding of LLMs' knowledge in BabyAI-Text
Language:Python
ErlebnisW/HARL
Official implementation of HARL algorithms based on PyTorch.
Language:Python
ErlebnisW/ISP-reID
ISP-reID
Language:Python
ErlebnisW/light-marl-baseline
Concise PyTorch implementations of MARL algorithms, including MAPPO, MADDPG, MATD3, QMIX, and VDN.
Language:Python
ErlebnisW/LlamaGym-py
Fine-tune LLM agents with online reinforcement learning
Language:Python
ErlebnisW/MARLlib
One repository is all that is necessary for Multi-agent Reinforcement Learning (MARL)
Language:Python
ErlebnisW/Mirror-Descent-in-MARL
ErlebnisW/MixEval
The official evaluation suite and dynamic data release for MixEval.
Language:Python
ErlebnisW/nvchad
Starter config for NvChad
Language:Lua
ErlebnisW/OfflineRL-Lib
Benchmarked implementations of Offline RL Algorithms.
ErlebnisW/OpenRLHF
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
Language:Python
ErlebnisW/poker
Google Research Football MARL Benchmark and Research Toolkit
Language:Python
ErlebnisW/policy-space-diversity-psro
Language:Python
ErlebnisW/pymnash
Python library for finding Nash equilibria of multiplayer games
Language:Python
ErlebnisW/RE-Control
Language:Python
ErlebnisW/RLHF-Reward-Modeling
Recipes to train reward model for RLHF.
Language:Python
ErlebnisW/Safe-RLHF
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Language:Python
ErlebnisW/smac
SMAC: The StarCraft Multi-Agent Challenge
Language:Python
ErlebnisW/sp-psro
Language:Python
ErlebnisW/trl
Train transformer language models with reinforcement learning.
Language:Python