/LLM_Reward_Model

Developing an LLM response-ranking reward model using RLHF, except the feedback comes from GPT-3.5 instead of a human.
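
As a rough illustration of what training a ranking reward model usually involves, here is a minimal sketch of the standard pairwise (Bradley-Terry style) ranking loss. It assumes the labeler (GPT-3.5 here) has marked one response in each pair as preferred; the function name `pairwise_ranking_loss` and the scalar rewards are hypothetical and are not taken from this repository's notebooks.

```python
import math


def pairwise_ranking_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(reward_chosen - reward_rejected)).

    Hypothetical helper: the loss is small when the reward model scores the
    preferred response above the rejected one, and large otherwise.
    """
    diff = reward_chosen - reward_rejected
    # -log(sigmoid(diff)) == log(1 + exp(-diff)); branch to avoid overflow.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


# Correctly ordered pair gives a lower loss than a mis-ordered pair.
loss_correct = pairwise_ranking_loss(2.0, 0.0)
loss_wrong = pairwise_ranking_loss(0.0, 2.0)
print(loss_correct < loss_wrong)
```

In an actual training loop the two rewards would come from the reward model's scalar output for each response, and the loss would be averaged over a batch of labeled pairs.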

Primary Language: Jupyter Notebook
