nottombrown/rl-teacher
Code for Deep RL from Human Preferences [Christiano et al.], plus a webapp for collecting human feedback
Python, MIT
Watchers
- arnav13081994
- berkeisinsh
- calferraro
- changkun (Researcher @mimuc and Engineer @Sixt)
- cookiegg
- cristianopris
- dalamar66 (Consultant)
- djmartingale
- drkostas (University of Tennessee, Knoxville)
- gandalfvn
- gxhrid
- Hao-HUST (Huawei Inc)
- jackclarksf
- lenyabloko (Yabloko US Lab)
- manolaz (Saigon, Vietnam)
- mihow (Portland, OR)
- mli2003
- mwcm (Ottawa, Canada)
- ngunsu (Universidad San Sebastian)
- nottombrown (Anthropic)
- pierg
- planetceres (San Francisco Bay Area)
- qifeng2010
- Raelifin (Berkeley, California, USA)
- RichardKelley (Reno, NV)
- Roottan (Shanghai CN)
- roschler
- RubinOrlando
- Sampson91
- sidenoteemail
- slightperturbation (San Francisco, CA)
- taewanlee (Portland, OR)
- talegari (Games24x7)
- tigerneil (Center for Safe AGI)
- wx-b (RIOS)
- zerohistory (Tokyo, Japan)