Update: This implementation is not finished. I plan to complete it once I have more time.

Chat-LTU

This is a chatbot project for the course D7058E at Luleå University of Technology. We try to implement something similar to InstructGPT or ChatGPT, based mostly on the papers and the RLHF blog post from Hugging Face.

Todo:

Implement PPO2 for faster RL fine-tuning.
Finish the partially implemented website for gathering real human data.
Upload the reward model and the fine-tuned model to Hugging Face for open-source use.

