/chat-ltu

Open source implementation of InstructGPT (not finished)

Primary language: Python. License: MIT.

Update: This implementation is not finished. I plan to complete it once I have more time.

Chat-LTU

This is a chatbot project for the course D7058E at Luleå University of Technology. We try to implement something similar to InstructGPT or ChatGPT, based mostly on the papers and the RLHF blog post from Hugging Face.

Todo:

  • Implement PPO2 for faster RL fine-tuning.
  • Finish the partially implemented website for gathering real human data.
  • Upload the reward model and the fine-tuned model to Hugging Face for open-source use.
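
For orientation on the first item: "PPO2" is assumed here to mean the clipped-objective variant of PPO, which is how the name is used in OpenAI Baselines. The sketch below is not code from this repository. It only illustrates the clipped surrogate loss in plain Python, and the function name `ppo_clip_loss` and its parameters are hypothetical.

```python
import math


def ppo_clip_loss(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Mean clipped surrogate loss (to be minimized) over sampled actions.

    Hypothetical helper for illustration only; not part of this project.
    new_logprobs / old_logprobs: log-probabilities of the same sampled
    actions under the current policy and the policy that generated them.
    advantages: advantage estimates for those actions.
    """
    if not (len(new_logprobs) == len(old_logprobs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if not new_logprobs:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the current and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (lower) bound of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)

    # Negate because optimizers minimize, while PPO maximizes the surrogate.
    return -total / len(new_logprobs)
```

In a real fine-tuning loop the same computation would run on framework tensors so that gradients flow through the new log-probabilities; the plain-float version above only shows the arithmetic.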