reward-model

There are 7 repositories under the reward-model topic.

Westlake-AI/SemiReward
[ICLR 2024] SemiReward: A General Reward Model for Semi-supervised Learning
Language: Python
rochitasundar/Generative-AI-with-Large-Language-Models
This repository contains the lab work for the Coursera course "Generative AI with Large Language Models".
Language: Jupyter Notebook
hlp-ai/miniChatGPT
Mini ChatGPT
Language: Python
taishan1994/Reward-Model-Finetuning
A repository dedicated to training reward models.
Language: Python
techandy42/LLM_Reward_Model
Developing an LLM response-ranking reward model using RLHF, except that the feedback comes from GPT-3.5 instead of humans.
Language: Jupyter Notebook
jddunn/rlhf
Proof-of-concept library built on TextRL for easy training and use of fine-tuned models with RLHF, a reward model, and PPO.
Language: Python
thisisHJLee/RLHF