pickxiguapi/Clean-Offline-RLHF

Offline RLHF codebase for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR 2024)

Python · MIT

Issues

Missing code and data for certain feedback types
#3 opened 5 months ago by thomas475
About the raw dataset of smarts
#2 opened 5 months ago by TU2021
How to add new environments and generate the corresponding dataset
#1 opened 8 months ago by Jasonxu1225