vwxyzjn

RLHF @huggingface, CS Ph.D. from Drexel University in RL.

@huggingface
Philadelphia, PA

Pinned Repositories

trl
Train transformer language models with reinforcement learning.
Language: Python 8k 71 855952
cleanba
CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL
Language: Python 89 4 49
cleanrl
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Language: Python 4.4k 34 166520
gym-microrts-paper
The source code for the gym-microrts paper.
Language: Python 38 4 63
invalid-action-masking
Source Code for A Closer Look at Invalid Action Masking in Policy Gradient Algorithms
Language: Python 125 2 318
lm-human-preference-details
RLHF implementation details of OAI's 2019 codebase
Language: Python 117 4 76
portwarden
Create Encrypted Backups of Your Bitwarden Vault with Attachments
Language: Go 540 9 2831
PPO-Implementation-Deep-Dive
DEPRECATED - please visit https://github.com/vwxyzjn/ppo-implementation-details
Language: Python 41 2 13
ppo-implementation-details
The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
Language: Python 543 3 582
summarize_from_feedback_details
Language: Python 74 4 09

vwxyzjn's Repositories

vwxyzjn/gym-microrts-paper
The source code for the gym-microrts paper.
Language: Python 38 4 63
vwxyzjn/gym-pysc2
Gym wrapper for pysc2
Language: Python 8 3 02
vwxyzjn/envpool-cleanrl
Language: Python 7 2 01
vwxyzjn/cleangpt
Language: Python 4 2 0
vwxyzjn/entity-ppo-demo
Language: Python 2 3 0
vwxyzjn/baselines
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
Language: Python 1 1 0
vwxyzjn/envpool-xla-cleanrl
Language: Python 1 2 0
vwxyzjn/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Language: Python 1 1 0
vwxyzjn/awesome-reinforcement-learning-lib
GitHub's code repository is all you need
1 01
vwxyzjn/dm-haiku
JAX-based neural network library
Language: Python 1 0
vwxyzjn/dragonfly
A modern replacement for Redis and Memcached
Language: C++ 1 0
vwxyzjn/enn-trainer
Language: Python 1 0
vwxyzjn/enn-zoo
Collection of entity-gym bindings for different reinforcement learning environments.
Language: Python 1 0
vwxyzjn/entity-gym
Standard interface for entity based reinforcement learning environments.
Language: Python 1 0
vwxyzjn/envpool
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
Language: C++ 1 0
vwxyzjn/flax
Flax is a neural network library for JAX that is designed for flexibility.
Language: Python 1 0
vwxyzjn/Gymnasium
A standard API for reinforcement learning and a diverse set of reference environments (formerly Gym)
Language: Python 1 0
vwxyzjn/hyperstate
Opinionated library for managing hyperparameters and mutable state of machine learning training systems.
Language: Python 1 0
vwxyzjn/IsaacGymEnvs
Isaac Gym Reinforcement Learning Environments
Language: Python 1 01
vwxyzjn/jaxrl
JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.
Language: Jupyter Notebook 1 0
vwxyzjn/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Language: Python 1 0
vwxyzjn/moolib
A library for distributed ML training with PyTorch
Language: C++ 1 0
vwxyzjn/moolib-data
Language: Python 2 0
vwxyzjn/poetry12bug
2 0
vwxyzjn/rl-experiments
Keeping track of RL experiments
1 0
vwxyzjn/rl_games
RL implementations
Language: Python 1 0
vwxyzjn/rogue-net
Entity Gym compatible ragged batch transformer implementation.
Language: Python 1 0
vwxyzjn/Shimmy
An API conversion tool for popular external reinforcement learning environments
Language: Python 1 0
vwxyzjn/v23
Volume 23 of JMLR
1 0
vwxyzjn/vwxyzjn.github.io
Language: HTML 2 1