facebookresearch/RLCD
Reproduction of "RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment"
Python · MIT
Stargazers
- 2kha (Bambou Tree Group)
- abaheti95 (Databricks)
- adhiiisetiawan (@KalbeDigitalLab)
- antx-code
- brotherb (Baidu Inc.)
- carpalmar (Adaptativa SAC)
- chenweixin107 (Tsinghua University)
- cloudwaysX (University of Washington)
- dmarx (Stability.ai, Eleuther.ai)
- e0397123 (National University of Singapore, ECE-HLT)
- fblissjr (Independent AI Strategy, Advisory, & Research)
- fly51fly (PRIS)
- gao-xiao-bai
- GitHub30 (Osaka, Japan)
- jinny1208 (POSTECH)
- Kaicheng-Yang0828 (DeepGlint)
- LeeJodie
- lu-m13 (Intel Labs China)
- mhnazeri
- ouhenio (Centro Nacional de Inteligencia Artificial)
- peterjc123 (Shanghai, China)
- peterpaniff (Shenzhen)
- phtvo (Ho Chi Minh, Viet Nam)
- ptnv-s
- ryantd (@kwai)
- saimarpaka
- seshurajup (@dolcera)
- Siris2314
- sjYoondeltar (Seoul)
- SSshuishui (Beihang University)
- superfan89 (Beijing, China)
- SushantDaga
- Thornhill-GYL
- vTuanpham (FPTU HCM)
- xin-li-67 (California, US)
- Zsbyqx20 (Institute for AI Industry Research, Tsinghua Univ)