vwxyzjn

RLHF @allenai, CS Ph.D. from Drexel University in RL.

@huggingface
Philadelphia, PA

Pinned Repositories

trl
Train transformer language models with reinforcement learning.
Language: Python | 10.2k 77 1.2k 1.3k
cleanba
CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL
Language: Python | 105 4 511
cleanrl
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Language: Python | 5.7k 38 185647
gym-microrts-paper
The source code for the gym-microrts paper.
Language: Python | 42 4 63
invalid-action-masking
Source Code for A Closer Look at Invalid Action Masking in Policy Gradient Algorithms
Language: Python | 139 2 322
lm-human-preference-details
RLHF implementation details of OAI's 2019 codebase
Language: Python | 154 4 78
portwarden
Create Encrypted Backups of Your Bitwarden Vault with Attachments
Language: Go | 590 10 3033
PPO-Implementation-Deep-Dive
DEPRECATED - please visit https://github.com/vwxyzjn/ppo-implementation-details
Language: Python | 45 2 13
ppo-implementation-details
The source code for the blog post The 37 Implementation Details of Proximal Policy Optimization
Language: Python | 649 3 699
summarize_from_feedback_details
Language: Python | 115 4 214

vwxyzjn/awesome-vue
A curated list of awesome things related to Vue.js
1 2 0
vwxyzjn/vuetify-parallax-starter2
Language: JavaScript | 1 3 0
vwxyzjn/Abstract_Algebra_Finite_Group_Generator
A brute force program that enumerates all possible permutations of binary operations on a given set.
Language: Python | 2 01
vwxyzjn/assignment1-demo
Language: JavaScript | 2 0
vwxyzjn/Costa_Tornado_Blog
Language: HTML | 2 0
vwxyzjn/docs
Documentation for Vuetify.js
Language: Vue | 2 0
vwxyzjn/genetics-algorithm
Language: JavaScript | 2 01
vwxyzjn/histraffic
Language: HTML | 2 0
vwxyzjn/pythonVSCode
Cross platform editing, debugging, linting, testing (and more) Python (2.7 to 3.6) code (including Jupyter support) using Visual Studio Code
Language: Python | 2 0
vwxyzjn/Reproduction_of_newsfeed.fb.com
This is a reproduction of newsfeed.fb.com by using GSAP and Vue.js
Language: Vue | 2 0
vwxyzjn/v-card-media_relative_import
Language: Vue | 2 0
vwxyzjn/vue-slider-component
A slider component that can be used in Vue 1.x and Vue 2.x
Language: Vue | 2 0
vwxyzjn/vuetify-landing-starter
Language: JavaScript | 2 0