
License: Creative Commons Zero v1.0 Universal (CC0-1.0)

Awesome LLM


Awesome series for Large Language Models (LLMs)

Contents

Models

Overview

| Name | Parameter size | Announcement date |
|------|----------------|-------------------|
| BERT-Large | 336 million | 2018 |
| T5 | 11 billion | 2020 |
| Gopher | 280 billion | 2021 |
| GPT-J | 6 billion | 2021 |
| LaMDA | 137 billion | 2021 |
| Megatron-Turing NLG | 530 billion | 2021 |
| T0 | 11 billion | 2021 |
| Macaw | 11 billion | 2021 |
| GLaM | 1.2 trillion | 2021 |
| T5 FLAN | 540 billion | 2022 |
| OPT-175B | 175 billion | 2022 |
| ChatGPT | 175 billion | 2022 |
| GPT-3.5 | 175 billion | 2022 |
| AlexaTM | 20 billion | 2022 |
| BLOOM | 176 billion | 2022 |
| Bard | Not yet announced | 2023 |
| GPT-4 | Not yet announced | 2023 |
| AlphaCode | 41.4 billion | 2022 |
| Chinchilla | 70 billion | 2022 |
| Sparrow | 70 billion | 2022 |
| PaLM | 540 billion | 2022 |
| NLLB | 54.5 billion | 2022 |
| Galactica | 120 billion | 2022 |
| UL2 | 20 billion | 2022 |
| Jurassic-1 | 178 billion | 2021 |
| LLaMA | 65 billion | 2023 |
| Stanford Alpaca | 7 billion | 2023 |
| GPT-NeoX 2.0 | 20 billion | 2023 |
| BloombergGPT | 50 billion | 2023 |
| Dolly | 6 billion | 2023 |
| Jurassic-2 | Not yet announced | 2023 |
| OpenAssistant LLaMA | 30 billion | 2023 |
| Koala | 13 billion | 2023 |
| Vicuna | 13 billion | 2023 |
| PaLM 2 | Not yet announced (smaller than PaLM) | 2023 |
| LIMA | 65 billion | 2023 |
| MPT | 7 billion | 2023 |
| Falcon | 40 billion | 2023 |
| Llama 2 | 70 billion | 2023 |
| Google Gemini | Not yet announced | 2023 |
| Microsoft Phi-2 | 2.7 billion | 2023 |
| Grok-0 | 33 billion | 2023 |
| Grok-1 | 314 billion | 2023 |
| Solar | 10.7 billion | 2024 |
| Gemma | 7 billion | 2024 |
| Grok-1.5 | Not yet announced | 2024 |
| DBRX | 132 billion | 2024 |
| Claude 3 | Not yet announced | 2024 |
| Gemma 1.1 | 7 billion | 2024 |
| Llama 3 | 70 billion | 2024 |

⬆️ Go to top

Open models

⬆️ Go to top

Projects

  • Visual ChatGPT - Announced by Microsoft / 2023
  • LMOps - Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities.

⬆️ Go to top

Commercial models

GPT

⬆️ Go to top

Gemini

  • Gemini - Announced by Google Deepmind / 2023

Bard

  • Bard - Announced by Google / 2023

⬆️ Go to top

Codex

⬆️ Go to top

Datasets

  • Sphere - Announced by Meta / 2022
    • A web corpus of 134M documents split into 906M passages.
  • Common Crawl
    • Over 3.15B pages and more than 380 TiB of data; public and free to use.
  • SQuAD 2.0
    • A question-answering dataset with 100,000+ questions.
  • Pile
    • An 825 GiB diverse, open-source language-modelling dataset.
  • RACE
    • A large-scale reading-comprehension dataset with more than 28,000 passages and nearly 100,000 questions.
  • Wikipedia
    • A Wikipedia dataset containing cleaned articles in all languages.

⬆️ Go to top

Benchmarks

⬆️ Go to top

Materials

Papers

Posts

⬆️ Go to top

Projects

GitHub repositories

  • Stanford Alpaca (tatsu-lab/stanford_alpaca) - A model fine-tuned from LLaMA 7B on 52K instruction-following demonstrations.
  • Dolly (databrickslabs/dolly) - A large language model trained on the Databricks Machine Learning Platform.
  • AutoGPT (Significant-Gravitas/Auto-GPT) - An experimental open-source attempt to make GPT-4 fully autonomous.
  • dalai (cocktailpeanut/dalai) - A CLI tool to run LLaMA on a local machine.
  • LLaMA-Adapter (ZrrSkywalker/LLaMA-Adapter) - Fine-tuning LLaMA to follow instructions within 1 hour using only 1.2M parameters.
  • alpaca-lora (tloen/alpaca-lora) - Instruct-tune LLaMA on consumer hardware.
  • llama_index (jerryjliu/llama_index) - A project that provides a central interface to connect your LLMs with external data.
  • openai/evals - A framework for evaluating LLMs, with an open-source registry of benchmarks.
  • trlx (CarperAI/trlx) - A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF).
  • pythia (EleutherAI/pythia) - A suite of 16 LLMs, all trained on public data seen in the exact same order, ranging in size from 70M to 12B parameters.
  • Embedchain (embedchain/embedchain) - A framework to create ChatGPT-like bots over your dataset.

⬆️ Go to top

HuggingFace repositories

  • OpenAssistant SFT 6 - A 30B LLaMA-based model fine-tuned by the OpenAssistant project for chat conversation.
  • Vicuna Delta v0 - An open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT.
  • MPT 7B - A decoder-style transformer pre-trained from scratch on 1T tokens of English text and code. This model was trained by MosaicML.
  • Falcon 7B - A 7B parameters causal decoder-only model built by TII and trained on 1,500B tokens of RefinedWeb enhanced with curated corpora.

⬆️ Go to top

Reading materials

⬆️ Go to top

Contributing

We welcome contributions to the Awesome LLM list! If you'd like to suggest an addition or make a correction, please follow these guidelines:

  1. Fork the repository and create a new branch for your contribution.
  2. Make your changes to the README.md file.
  3. Ensure that your contribution is relevant to the topic of LLMs.
  4. Use the following format to add your contribution:
[Name of Resource](Link to Resource) - Description of resource
  5. Add your contribution in alphabetical order within its category.
  6. Make sure that your contribution is not already listed.
  7. Provide a brief description of the resource and explain why it is relevant to LLMs.
  8. Create a pull request with a clear title and description of your changes.
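
For example, a complete entry following this format would look like the line below (the name, link, and description are purely illustrative, not a real resource):

[Example Dataset](https://example.com/dataset) - A brief description of the resource and why it is relevant to LLMs.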

We appreciate your contributions and thank you for helping to make the Awesome LLM list even more awesome!

⬆️ Go to top