Collection of open source implementations of LLMs with IFT and RLHF that are striving to get to ChatGPT level of performance

MIT License

Awesome Open Source LLM + IFT (+RLHF)

This is a collection of open source implementations of LLMs with instruction fine-tuning (IFT) and reinforcement learning from human feedback (RLHF) that aim to reach ChatGPT-level performance. A few observations that may help you try the demos quickly for yourself:

  • Most of the Colab notebooks require a Colab Pro account ($9.99/month) for Premium GPU access
  • Quantized GPT4All can run on a laptop CPU thanks to llama.cpp
  • LoRA models can be fine-tuned on consumer hardware (e.g. an RTX 4090), whereas the non-LoRA models seem to require 8-10 hours on 8×A100 systems (costing <$100 of compute time)
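
The hardware observations above come down to simple arithmetic. The following is a rough sketch rather than a measurement: the 7B parameter count, the hidden size of 4096, the 32 layers, the LoRA rank of 8, the two adapted matrices per layer, and the $100 budget are all assumptions chosen for illustration, and the helper names are hypothetical.

```python
# Back-of-the-envelope numbers behind the observations above.
# Every constant here is an illustrative assumption, not a measurement.


def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def lora_trainable_params(d_in: int, d_out: int, rank: int, n_matrices: int) -> int:
    """Parameters added by LoRA: each adapted (d_out x d_in) matrix gets
    two low-rank factors, B (d_out x rank) and A (rank x d_in)."""
    return n_matrices * rank * (d_in + d_out)


def implied_gpu_hour_price(budget_usd: float, n_gpus: int, hours: float) -> float:
    """Hourly price per GPU at which a run costs exactly budget_usd."""
    return budget_usd / (n_gpus * hours)


N_PARAMS = 7e9  # assumed 7B-parameter model

# Quantization: the same weights at 16 bits versus 4 bits per weight.
fp16_gib = weight_memory_gib(N_PARAMS, 16)
q4_gib = weight_memory_gib(N_PARAMS, 4)

# LoRA: assumed hidden size 4096, 32 layers, 2 adapted matrices per layer.
lora_params = lora_trainable_params(d_in=4096, d_out=4096, rank=8, n_matrices=32 * 2)
lora_fraction = lora_params / N_PARAMS

# Full fine-tuning: 8 GPUs for 8 to 10 hours within an assumed $100 budget.
price_10h = implied_gpu_hour_price(100, n_gpus=8, hours=10)
price_8h = implied_gpu_hour_price(100, n_gpus=8, hours=8)

print(f"fp16 weights:  {fp16_gib:.1f} GiB")
print(f"4-bit weights: {q4_gib:.2f} GiB")
print(f"LoRA trainable parameters: {lora_params:,} ({lora_fraction:.4%} of the model)")
print(f"Implied price per GPU-hour: ${price_10h:.2f} to ${price_8h:.2f}")
```

Under these assumptions, 4-bit weights fit in a few GiB of laptop RAM, the LoRA adapter trains well under 0.1% of the model's parameters, and the <$100 figure corresponds to roughly $1.25 to $1.56 per GPU-hour. Real quantized formats also store scaling data alongside the weights, so actual model files are somewhat larger than this estimate.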

Commercial-use models

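| Name | Base model | IFT | IFT data | RLHF | LoRA | Quantization | Commercial Use | Links |
|---|---|---|---|---|---|---|---|---|
| Llama-2 | Llama-2 7B, 13B, 70B | | Meta, Anthropic, OpenAI | | | | | Model, Paper |
| Falcon | Falcon-40B instruct | | Baize | | | | | Model |
| MPT | MPT-7B instruct | | dolly-15k, Anthropic | | | | | Spaces |
| Dolly 2.0 | Pythia-12B | | dolly-15k | | | | | Model, GitHub |
| OpenChatKit | Pythia 7B, GPT-NeoXT-20B | | LAION OIG | | | | | Spaces, GitHub |
| Open Assistant | Pythia 12B | | OASST1 | | | | | Demo, Model, GitHub |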

Non commercial-use models

| Name | Base model | IFT | IFT data | RLHF | LoRA | Quantization | Commercial Use | Links |
|---|---|---|---|---|---|---|---|---|
| Alpaca | Llama 7B, Llama 13B | | Alpaca (davinci-003), gpt-4 | | | | | Alpaca model, GPT-4 model |
| Vicuna | Llama 13B | | ShareGPT | | | | | Demo, GitHub |
| Koala, EasyLM | Llama 13B | | Alpaca, ShareGPT, HC3, LAION OIG, Anthropic, WebGPT, Summaries | | | | | Demo, GitHub |
| Alpaca+LoRA | Llama 7B | | Alpaca (davinci-003) Cleaned | | | | | Spaces, GitHub |
| Baize | Llama 7B, Llama 13B, Llama 30B | | gpt-3.5-turbo | | | | | Spaces, GitHub |
| GPT4All | Llama 7B | | gpt-3.5 | | | | | GitHub |
| Instruct GPT-J+LoRA | GPT-J-6B | | Alpaca (davinci-003) | | | | | Colab, Model |
| Dolly | GPT-J-6B | | Alpaca (davinci-003) | | | | | Model, GitHub |
| Dolly+LoRA | GPT-J-6B | | Alpaca (davinci-003) Cleaned | | | | | Colab |
| ColossalChat | Llama 7B | | | | | | | Demo, GitHub |
| ChatRWKV | RWKV (RNN based) | | Alpaca | | | | | Spaces, GitHub |
| StableLM | StableLM-base | | Alpaca, GPT4All, Dolly, ShareGPT, and HH | | | | | Spaces, GitHub |
| MPT | MPT-7B chat | | Anthropic, Vicuna, Alpaca, HC3, Evol-instruct | | | | | Spaces |
| Free Willy | Llama | | COT, NIV2, FLAN'21, T0 | | | | | Model |

Code only

| Name | Base model | IFT | IFT data | RLHF | LoRA | Quantization | Commercial Use | Links |
|---|---|---|---|---|---|---|---|---|
| TRL-PEFT | | | | | | | | code only, no model |
| DeepSpeed Chat | OPT | | | | | | | code only, no model |

Benchmarks

The following resources maintain active benchmarks of the above and similar models: