pickxiguapi/Clean-Offline-RLHF
Offline RLHF codebase for "Uni-RLHF: Universal Platform and Benchmark Suite for Reinforcement Learning with Diverse Human Feedback" (ICLR 2024)
Language: Python. License: MIT.
Issues
About the raw dataset of smarts
#2 opened by TU2021