Collection of LLM resources that can be used to build products you can "own" or to perform reproducible research. Please note there are Terms of Service around some of the weights and training data that should be investigated before commercialization.
Table of Contents:
- Leaderboards
- Local LLMs
- LLM-based Tools
- Training and Quantization
- Non-English Models
- Autonomous Agents
If you are looking for a list of open-source LLMs that can be used commercially, this is a great list: Open LLMs
I am having a hard time keeping up with the latest and greatest open-source LLMs. Below are leaderboards I am checking periodically:
- HuggingFace Open LLM Leaderboard - The 🤗 Open LLM Leaderboard aims to track, rank and evaluate LLMs and chatbots as they are released. (2023-05-23, HuggingFace)
- AlpacaEval 🦙 Leaderboard - An Automatic Evaluator for Instruction-following Language Models (2023-07-01, Stanford Alpaca/Tatsu Lab)
- Code Generation on HumanEval - HumanEval problem solving dataset described in the paper "Evaluating Large Language Models Trained on Code" (2023-07-01, Papers With Code)
- LLM Foundry - Release repo for MPT-7B and related models. (2023-05-05, MosaicML, Apache 2.0)
- FastChat - Release repo for Vicuna and FastChat-T5 (2023-04-20, LMSYS, Apache 2.0)
- StableLM - Stability AI Language Models (2023-04-19, StabilityAI, Apache and CC BY-SA-4.0)
- GPT4All - LLM trained with ~800k GPT-3.5-Turbo Generations based on GPT-J and LLaMA. (2023-04-13, Nomic AI, Apache/Meta ToS/OpenAI ToS)
- Dolly - Large language model trained on the Databricks Machine Learning Platform (2023-03-24, Databricks Labs, Apache)
- bloomz.cpp - Inference of HuggingFace's BLOOM-like models in pure C/C++. (2023-03-16, Nouamane Tazi, MIT License)
- alpaca.cpp - Locally run an Instruction-Tuned Chat-Style LLM (2023-03-16, Kevin Kwok, MIT License)
- Stanford Alpaca - Code and documentation to train Stanford's Alpaca models, and generate the data. (2023-03-13, Stanford CRFM, Apache License, Non-Commercial Data, Meta/OpenAI ToS)
- llama.cpp - Port of Facebook's LLaMA model in C/C++. (2023-03-10, Georgi Gerganov, MIT License)
- ChatRWKV - ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source. (2023-01-09, PENG Bo, Apache License)
- RWKV-LM - RNN with Transformer-level LLM performance. Combines best of RNN and transformer: fast inference, saves VRAM, fast training. (2022?, PENG Bo, Apache License)
- Open Assistant - A chat-based ChatGPT-like large language model. (2023-04-15, Pythia, LLaMA, Apache 2.0 License)
- RedPajama-Data-1T (2023-04-17, @togethercompute)
- Vicuna 13b (2023-04-12, @lmsysorg)
- Dolly 15k Instruction Tuning Labels (2023-04-12, Databricks, CC BY-SA 3.0, allows commercial use)
- Cerebras-GPT 7 Models (2023-03-28, HuggingFace, Cerebras, Apache License)
- Alpaca Data Cleaned (2023-03-21, Gene Ruebsamen, Apache & OpenAI ToS)
- Alpaca Dataset (2023-03-13, HuggingFace, Tatsu-Lab, Meta ToS/OpenAI ToS)
- Alpaca Model Search (HuggingFace, Meta ToS/OpenAI ToS)
- Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs (2023-05-05, MosaicML, Blog Post)
- Google "We Have No Moat, And Neither Does OpenAI" (2023-05-04, Leaked Internal Google Document)
- RedPajama reproduces LLaMA training dataset of over 1.2 trillion tokens (2023-04-17, Together, Blog Post)
- What’s in the RedPajama-Data-1T LLM training set (2023-04-17, Simon Willison, Blog Post)
- GPT4All-J: An Apache-2 Licensed Assistant-Style Chatbot (2023-04-13, nomic.ai)
- Databricks releases Dolly 2.0, the first open, instruction-following LLM for commercial use (2023-04-13, VentureBeat, Sharon Goldman)
- Summary of Current Models (2023-04-11, Dr Alan D. Thompson, Google Sheet)
- Running GPT4All On a Mac Using Python langchain in a Jupyter Notebook (2023-04-04, Tony Hirst, Blog Post)
- Vicuna Homepage (2023-04-01, Meta ToS)
- Cerebras-GPT vs LLaMA AI Model Comparison (2023-03-29, LunaSec, Blog Post)
- Cerebras-GPT: Family of Open, Compute-efficient, LLMs (2023-03-28, Cerebras, Blog Post)
- Hello Dolly: Democratizing the magic of ChatGPT with open models (2023-03-24, Databricks, Blog Post)
- The Coming of Local LLMs (2023-03-23, Nick Arner, Blog Post)
- The RWKV language model: An RNN with the advantages of a transformer (2023-03-23, Johan Sokrates Wind, Blog Post)
- Bringing Whisper and LLaMA to the masses (2023-03-15, The Changelog & Georgi Gerganov, Podcast Episode)
- Alpaca: A Strong, Replicable Instruction-Following Model (2023-03-13, Stanford CRFM, Project Homepage)
- Large language models are having their Stable Diffusion moment (2023-03-10, Simon Willison, Blog Post)
- Running LLaMA 7B and 13B on a 64GB M2 MacBook Pro with llama.cpp (2023-03-10, Simon Willison, Blog/Today I Learned)
- Introducing LLaMA: A foundational, 65-billion-parameter large language model (2023-02-24, Meta AI, Meta ToS)
- MiniGPT-4 - Enhancing Vision-language Understanding with Advanced Large Language Models (2023-04-17, Vision CAIR Research Group, KAUST, BSD)
- Text generation web UI - A gradio web UI for running Large Language Models like LLaMA, llama.cpp, GPT-J, Pythia, OPT, and GALACTICA. (2023-04-15, oobabooga, AGPL)
- TextSynth - REST API for Large Language Models. Supports a variety of models. (2023-04-13, Fabrice Bellard, Commercial License GPU/Shareware CPU)
- FastChat - The release repo for Vicuna: An Open Chatbot Impressing GPT-4 (2023-04-13, LM-SYS, Apache)
- tabby - Self-hosted AI coding assistant. (2023-04-12, TabbyML)
- Basaran - Open-source text completion API for Transformers-based text generation models. (2023-04-12, Hyperonym)
- TurboPilot - Copilot clone that runs a 6B code completion LLM on CPU with 4GB of RAM. (2023-04-11, James Ravenscroft)
- talkGPT4All - A voice chatbot based on OpenAI Whisper and GPT4All, running on a local laptop. (2023-04-09, Yunfeng Wang, MIT License)
- LLMZoo - Data, models, and evaluation benchmark for large language models (2023-04-08, FreedomIntelligence, Apache)
- LMFlow - An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. (2023-04-06, OptimalScale)
- xturing - Build and control your own LLMs (2023-04-03, stochastic.ai)
- DeepSpeed - Deep learning optimization library that makes distributed training and inference easy. (2023-04-13, Microsoft, Apache)
- GPTQ-for-LLaMA - 4-bit quantization of LLaMA using GPTQ (2023-04-01, qwopqwop200, Meta ToS)
- GPTQ - Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers". (2023-03-22, IST Austria Distributed Algorithms and Systems Lab)
- Polpaca - Alpaca Speaks Polish (2023-04-13, Marcin Mosiolek)
- KOZA - An instruct model for the Polish language, forked from alpaca-lora. (2023-04-13, Leszek Bukowski)
- Owca - A Polish-translated dataset of instructions for fine-tuning the Alpaca model (2023-04-13, Emplocity)
- AI Legion - JS/TS framework for autonomous agents that can work together to accomplish tasks. (2023-04-13, eumemic, MIT)
- AgentGPT - Assemble, configure, and deploy autonomous AI Agents in your browser. (2023-04-12, Rework.ai)
- babyagi - Python script example of an AI-powered task management system. Uses OpenAI and Pinecone APIs to create, prioritize, and execute tasks. (2023-04-06, Yohei Nakajima)
- ChatArena - Multi-Agent Language Game Environments for LLMs. (2023-04-05, UCL)
- Auto-GPT - An experimental open-source attempt to make GPT-4 fully autonomous. (2023-04-06, Toran Bruce Richards)
- JARVIS - A system to connect LLMs with the ML community (2023-04-06, Microsoft)
- Autolang - Based on BabyAGI, focused on workflows that complete. Powered by langchain. (2023-04-10, Alvaro Sevilla)
- Emergent autonomous scientific research capabilities of large language models (2023-04-11, Daniil A. Boiko, Robert MacKnight, and Gabe Gomes, Carnegie Mellon University)
- Generative Agents: Interactive Simulacra of Human Behavior (2023-04-07, Stanford and Google)
- Twitter List: Homebrew AGI Club (2023-04-06, @altryne)
- LangChain: Custom Agents (2023-04-03, LangChain)
- HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace (2023-04-02, Microsoft)
- Introducing Agents in Haystack: Make LLMs resolve complex tasks (2023-03-30, Haystack and Deepset)
- Introducing "🤖 Task-driven Autonomous Agent" (2023-03-29, @yoheinakajima)
- A simple Python implementation of the ReAct pattern for LLMs (2023-03-17, Simon Willison)
- ReAct: Synergizing Reasoning and Acting in Language Models (2023-03-10, Princeton & Google)