feyzaakyurek/rl4f
Code for RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs. ACL 2023.
Python · MIT
Stargazers
- 0xKoda (Mempool)
- akkikiki (AWS AI Labs)
- altriasjy31m78
- andres-ramirez-duque
- AndySum (Australia)
- atyenoria (VoicePing)
- bobbercheng
- dreasysnail (Apple)
- dumpmemory
- EthenZhang
- fly51fly (PRIS)
- fukexue (Fudan University)
- Gary-code (SCUT)
- inimah (Indonesian Institute of Sciences, LIPI)
- ionflow
- jokieleung (SYSU)
- junyz0
- kriskumar
- lijie2160 (Beijing Jiaotong University)
- lingtingSir (@Meituan-Dianping)
- liuqi8827 (Harbin Institute of Technology)
- MilleniumSpark
- odellus (@phytomech)
- provRaminHamediZaviehgard
- q121q
- SandalotsVolcanak
- sing1ee (Billions Tech)
- taesiri (Planet Mars)
- taishan1994 (flow++)
- TheodorosGalanos
- we1l1n
- wjhou (PolyU & SUSTech)
- xiami2019 (Fudan University & Sun Yat-Sen University)
- YinpeiDai (Computer Science)
- yotamnahum (@Samplead)
- ZubinGou (Tsinghua University)