Pinned Repositories
trl
Train transformer language models with reinforcement learning.
cleanba
CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL
cleanrl
High-quality single-file implementations of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
gym-microrts-paper
The source code for the gym-microrts paper.
invalid-action-masking
Source Code for A Closer Look at Invalid Action Masking in Policy Gradient Algorithms
lm-human-preference-details
RLHF implementation details of OpenAI's 2019 codebase
portwarden
Create Encrypted Backups of Your Bitwarden Vault with Attachments
PPO-Implementation-Deep-Dive
DEPRECATED - please visit https://github.com/vwxyzjn/ppo-implementation-details
ppo-implementation-details
The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
summarize_from_feedback_details
vwxyzjn's Repositories
vwxyzjn/PPO-Implementation-Deep-Dive
DEPRECATED - please visit https://github.com/vwxyzjn/ppo-implementation-details
vwxyzjn/a2c_is_a_special_case_of_ppo
A2C is a special case of PPO!
vwxyzjn/vectorized-value-methods
[WIP] Vectorized architecture for value-based methods such as DQN and DDPG
vwxyzjn/launcha
Launcha is a simple Docker-based cloud job launcher.
vwxyzjn/validate-new-gym-mujoco-envs
vwxyzjn/Arcade-Learning-Environment
The Arcade Learning Environment (ALE) -- a platform for AI research.
vwxyzjn/birthday
A Happy Birthday animation design in CSS3, HTML5
vwxyzjn/brax
Massively parallel rigidbody physics simulation on accelerator hardware.
vwxyzjn/composer
Library of algorithms to speed up neural network training
vwxyzjn/container-apps-store-api-microservice
Sample microservices solution using Azure Container Apps, Dapr, Cosmos DB, and Azure API Management
vwxyzjn/draw.io
vwxyzjn/environment
Neural MMO - A Massively Multiagent Environment for Artificial Intelligence Research
vwxyzjn/gym
A toolkit for developing and comparing reinforcement learning algorithms.
vwxyzjn/gym-docs
Code for Gym documentation website
vwxyzjn/gym-microrts-paper-sb3
RL agent to play μRTS with Stable-Baselines3
vwxyzjn/gym-microrts-static-files
vwxyzjn/gym-robotics
vwxyzjn/iclr-blog-track.github.io
vwxyzjn/incubator
Collection of in-progress libraries for entity neural networks.
vwxyzjn/isort
A Python utility / library to sort imports.
vwxyzjn/launcha-sb3-example
vwxyzjn/MA-ALE2
vwxyzjn/microrts-sb3
vwxyzjn/minihack
MiniHack the Planet: A Sandbox for Open-Ended Reinforcement Learning Research
vwxyzjn/MultiAgentObjectCollectorEnv
vwxyzjn/nmmo-cleanrl-incubator
vwxyzjn/PPO-Procgen-Reproduction
vwxyzjn/stable-baselines3
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
vwxyzjn/stable-baselines3-contrib
Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code
vwxyzjn/tianshou
An elegant PyTorch deep reinforcement learning library.