tatsu-lab/alpaca_farm
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
Python · Apache-2.0
Stargazers
- 99starman (University of British Columbia)
- AndromedaPerseus (United States)
- andysalerno (Microsoft)
- assimelha (Gothenburg, Sweden)
- augustintoma
- backpropper (Google DeepMind)
- DinoDefend
- ekryski (@bidalihq)
- emigmo (Tsinghua University)
- feifeiobama (Peking University)
- gmittal (Palo Alto, CA)
- gordonhu608 (UCLA)
- guykhazma (@IBM)
- hugo-alves (Lisbon)
- huyangqiu
- ittailup (@emptor)
- jackcook (The New York Times)
- jd-hernandez (Encora, Inc.)
- kennyfrc (@Monolith-Growth-Consulting)
- kunato (KUNANA AI)
- lxuechen (Stanford University)
- mzntaka0
- OhadCohen97 (Israel)
- PWhiddy (Seattle WA)
- ryoungj (University of Toronto)
- shanbady (MIT Open Learning)
- shyamsn97
- siva-fincent
- smellslikeml (SmellsLikeML)
- stevesolun
- tokestermw (Cresta)
- u-brixton
- varunshenoy
- xiaowu0162 (UCLA)
- yotamnahum (@Samplead)
- zwhe99 (Shanghai Jiao Tong University)