Awesome Open Source LLM + IFT (+RLHF)

This is a collection of open source implementations of LLMs with instruction fine-tuning (IFT) and reinforcement learning from human feedback (RLHF) that aim to reach ChatGPT-level performance. A few observations that may help you try the demos quickly:

Most of the Colab notebooks require a Colab Pro account ($9.99/month) to get Premium GPU access
Quantized GPT4All can run on a laptop CPU thanks to llama.cpp
LoRA models are fine-tunable on consumer hardware (e.g. an RTX 4090), whereas the non-LoRA models seem to require 8-10 hours on 8xA100 systems (under $100 of compute time)
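
The observations above can be sanity-checked with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 7B parameter count, the 16-bit and 4-bit widths, the 4096 hidden size, and the LoRA rank of 8 are assumptions chosen for the example, not figures taken from any project in this list. It counts weight memory only and ignores activations, optimizer state, and runtime overhead.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters when the update to a (d_out x d_in) weight
    matrix is factored as B (d_out x rank) times A (rank x d_in)."""
    return rank * (d_in + d_out)


def max_hourly_rate(budget_usd: float, n_gpus: int, hours: float) -> float:
    """Highest per-GPU hourly price that keeps a run within budget."""
    return budget_usd / (n_gpus * hours)


# Assumed 7B-parameter model: 16-bit weights vs. 4-bit quantized weights.
print("fp16 weights (GB):", weight_memory_gb(7e9, 16))
print("4-bit weights (GB):", weight_memory_gb(7e9, 4))

# Assumed 4096 x 4096 projection matrix with a rank-8 LoRA update.
full = 4096 * 4096
low_rank = lora_params(4096, 4096, 8)
print("trainable fraction with LoRA:", low_rank / full)

# 8 GPUs for 10 hours under a $100 budget, as in the observation above.
print("implied max $/GPU-hour:", max_hourly_rate(100, 8, 10))
```

Under these assumptions the quantized weights take roughly a quarter of the 16-bit footprint, which is consistent with a quantized model fitting in laptop memory, and the rank-8 update trains well under 1% of the parameters in that matrix.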

Commercial-use models

Name	Base model	IFT	IFT data	RLHF	LoRA	Quantization	Commercial Use	Links
Llama-2	Llama-2 7B 13B 70B	✅	Meta Anthropic OpenAI	✅	✅	✅	✅	Model Paper
Falcon	Falcon-40B instruct	✅	Baize	❌	❌	❌	✅	Model
MPT	MPT-7B instruct	✅	dolly-15k Anthropic	❌	❌	❌	✅	Spaces
Dolly 2.0	Pythia-12B	✅	dolly-15k	❌	❌	❌	✅	Model Github
OpenChatKit	Pythia 7B GPT-NeoXT-20B	✅	LAION OIG	❌	❌	✅	✅	Spaces Github
Open Assistant	Pythia 12B	✅	OASST1	✅	❌	❌	✅	Demo Model Github
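
If you want to filter the table by feature rather than read it by eye, a minimal sketch along these lines may help. It assumes the rows are tab-separated with a ✅/❌ mark in each feature column, as in this list; `parse_table` and `models_with` are hypothetical helper names, and only four rows are copied in to keep the example short.

```python
import csv
import io

# A few rows copied from the commercial-use table above (tab-separated).
TABLE = (
    "Name\tBase model\tIFT\tIFT data\tRLHF\tLoRA\tQuantization\tCommercial Use\tLinks\n"
    "Llama-2\tLlama-2 7B 13B 70B\t✅\tMeta Anthropic OpenAI\t✅\t✅\t✅\t✅\tModel Paper\n"
    "Falcon\tFalcon-40B instruct\t✅\tBaize\t❌\t❌\t❌\t✅\tModel\n"
    "Dolly 2.0\tPythia-12B\t✅\tdolly-15k\t❌\t❌\t❌\t✅\tModel Github\n"
    "OpenChatKit\tPythia 7B GPT-NeoXT-20B\t✅\tLAION OIG\t❌\t❌\t✅\t✅\tSpaces Github\n"
)


def parse_table(text):
    """Parse the tab-separated table into a list of dicts keyed by header."""
    reader = csv.DictReader(
        io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return list(reader)


def models_with(rows, features):
    """Return the names of models marked ✅ in every listed column."""
    return [
        row["Name"]
        for row in rows
        if all(row[feature] == "✅" for feature in features)
    ]


ROWS = parse_table(TABLE)
print(models_with(ROWS, ["Quantization", "Commercial Use"]))
print(models_with(ROWS, ["RLHF"]))
```

The same helpers would work on the other tables in this list, since they share the header row.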

Non-commercial-use models

Name	Base model	IFT	IFT data	RLHF	LoRA	Quantization	Commercial Use	Links
Alpaca	Llama 7B Llama 13B	✅	Alpaca (davinci-003) gpt-4	❌	❌	❌	❌	Alpaca model GPT-4 model
Vicuna	Llama 13B	✅	ShareGPT	❌	❌	❌	❌	Demo Github
Koala EasyLM	Llama 13B	✅	Alpaca ShareGPT HC3 LAION OIG Anthropic WebGPT Summaries	❌	❌	❌	❌	Demo Github
Alpaca+LoRA	Llama 7B	✅	Alpaca (davinci-003) Cleaned	❌	✅	❌	❌	Spaces Github
Baize	Llama 7B Llama 13B Llama 30B	✅	gpt-3.5-turbo	❌	✅	❌	❌	Spaces Github
GPT4All	Llama 7B	✅	gpt-3.5	❌	✅	✅	❌	Github
Instruct GPT-J+LoRA	GPT-J-6B	✅	Alpaca (davinci-003)	❌	✅	❌	❌	Colab Model
Dolly	GPT-J-6B	✅	Alpaca (davinci-003)	❌	❌	❌	❌	Model Github
Dolly+LoRA	GPT-J-6B	✅	Alpaca (davinci-003) Cleaned	❌	✅	❌	❌	Colab
ColossalChat	Llama 7B	✅		✅	✅	✅	❌	Demo Github
ChatRWKV	RWKV (RNN-based)	✅	Alpaca	❌	❌	✅	❌	Spaces Github
StableLM	StableLM-base	✅	Alpaca, GPT4All, Dolly, ShareGPT, and HH	❌	❌	❌	❌	Spaces Github
MPT	MPT-7B chat	✅	Anthropic Vicuna Alpaca HC3 Evol-instruct	❌	❌	❌	❌	Spaces
Free Willy	Llama	✅	COT, NIV2 FLAN'21, T0	❌	❌	❌	❌	Model

Code only

Name	Base model	IFT	IFT data	RLHF	LoRA	Quantization	Commercial Use	Links
TRL-PEFT		✅		✅	✅	✅		code only, no model
DeepSpeed Chat	OPT	✅		✅	✅	✅		code only, no model

Benchmarks

The following resources maintain active benchmarks of the above and similar models:

arjunbansal/awesome-oss-llm-ift-rlhf
