WooooDyy/LLM-Reverse-Curriculum-RL
Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" presented by Zhiheng Xi et al.
Python
Stargazers
- air-balls
- albzni
- ChangyuChen347
- ChaoPeng13china
- chenwxOggai (Fudan University, Shanghai)
- CytAI
- DanielPhoton
- denisfitz57
- EVAN-LI98
- hanningzhang (Hong Kong)
- HBY-hub
- HongruiFan (Central China Normal University)
- hot-zhy (East China Normal University)
- JeffCarpenter (Canada)
- jindc (Beijing, China)
- kleinzcy
- MasterVito (Tsinghua University)
- MrZhengXin (University of Chinese Academy of Sciences)
- ParadoxZW (Hangzhou, China)
- rich-junwang
- SHR238
- SparkJiao (NTU-NLP & I2R, A*STAR, Singapore)
- taichengguo (University of Notre Dame)
- tokarev-i-v
- tongjingqi
- WangHanLinHenry (South China University of Technology)
- william11ya
- WooooDyy (Fudan University)
- XinGuo2002 (Fudan University)
- xipq (MSFT)
- xmzhao (Tencent)
- yhc582825016
- zhanghaoie
- zhongly1021 (Ann Arbor)
- zhoujz10
- zzhang0179 (Fudan University)