Edward-Sun/easy-to-hard
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision
Python · BSD-3-Clause
Issues
- 1
Question about REST-EM
#9 opened by mandyyyyii - 1
- 6
question about reward score
#7 opened by DecideToLeave - 0
PRM loss changes
#6 opened by DecideToLeave - 0
- 4
Two questions about the article
#4 opened by xiaolizh1 - 1
About the training scripts.
#3 opened by Zjshadow - 2
readme for data
#2 opened by rguo12 - 1