Curated List of Large Language Models
Large Language Models (LLMs) are deep learning models designed to understand and generate human language. Trained on vast amounts of data, they learn to make predictions from that knowledge, and their development is one of the most exciting recent advances in artificial intelligence and natural language processing.
LLMs can generate text that is often indistinguishable from human writing. They can draft news articles, generate dialogue, caption images, answer questions, and translate languages. Their applications range from chatbots to virtual assistants, and they are becoming increasingly important in fields such as journalism, marketing, and customer service.
One of the most important aspects of LLMs is their ability to learn from large amounts of data. They are typically trained on massive text corpora, such as books, articles, and social media posts; by analyzing this data they learn the patterns and structures of language and use that knowledge to generate new text.
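As a toy illustration of that last point, the sketch below samples a continuation from a small open checkpoint using the Hugging Face transformers pipeline; the model name (gpt2) is just a convenient stand-in, and any causal language model from the list below could be substituted.

```python
# Minimal sketch: sample a continuation from a small open checkpoint.
# "gpt2" is only a convenient stand-in; any causal LM from the list below works.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "Large language models are trained on"
result = generator(prompt, max_new_tokens=30, do_sample=True)
print(result[0]["generated_text"])
```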
- BLOOM: BigScience Large Open-science Open-access Multilingual Language Model
- Galactica: A Large Language Model for Science. GALACTICA is a general-purpose scientific language model trained on a large corpus of scientific text and data. It performs scientific NLP tasks at a high level, as well as tasks such as citation prediction, mathematical reasoning, molecular property prediction, and protein annotation. More information is available at galactica.org.
- LLaMA: Open and Efficient Foundation Language Models
- MPT-30B: Mosaic Pretrained Transformer
- OPT (Open Pre-trained Transformers)
- OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization
- Pythia: a suite of 16 LLMs spanning 70 million to 12 billion parameters, developed by EleutherAI (a minimal loading sketch follows this entry).
- Model sizes: 70M, 160M, 410M, 1.0B, 1.4B, 2.8B, 6.9B, 12B
- Code
- Paper
- Hugging Face
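A minimal loading sketch for the suite, assuming the standard Hugging Face transformers API; the 70M checkpoint is used here only because it is the smallest, and the model id can be swapped for any of the sizes listed above.

```python
# Minimal sketch: load the smallest Pythia checkpoint from the Hugging Face Hub
# and generate a short continuation. Swap the model id for any size in the suite,
# e.g. "EleutherAI/pythia-1.4b" or "EleutherAI/pythia-12b".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/pythia-70m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The scaling behaviour of language models", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=30, do_sample=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```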
- Alpaca: Stanford Alpaca: An Instruction-following LLaMA Model
- Alpaca-LoRA:
- Training a ChatGPT-style alternative on consumer GPUs (see the LoRA sketch after this entry)
- Code
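Alpaca-LoRA keeps the base LLaMA weights frozen and trains only small low-rank adapter matrices, which is what makes consumer-GPU training feasible. The sketch below shows the general shape of such a setup with the Hugging Face peft library; the base model id, target modules, and hyperparameters are illustrative placeholders rather than the repository's exact configuration.

```python
# Sketch of a LoRA adapter setup (the general technique behind Alpaca-LoRA).
# The base model id and hyperparameters are illustrative placeholders, not the
# repository's exact configuration.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")  # placeholder base model

lora_config = LoraConfig(
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # scaling factor applied to the adapter output
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
# ...then fine-tune with a standard transformers Trainer on an instruction dataset.
```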
- 🦙 ChatLLaMA: a library for creating hyper-personalized ChatGPT-like assistants from your own data with as little compute as possible. Instead of depending on one large assistant that "rules us all", the project envisions a future in which each of us can create a personalized ChatGPT-like assistant, with many ChatLLaMAs at the "edge" supporting a variety of human needs. Creating a personalized assistant at the edge, however, requires substantial optimization effort on many fronts: dataset creation, efficient training with RLHF, and inference optimization.
- Code
- Paper: Not available
- Demo: on own model
- Koala: A Dialogue Model for Academic Research
- MPT-30B-Instruct: a model for short-form instruction following, built by fine-tuning MPT-30B on Dolly HHRLHF (derived from the Databricks Dolly-15k and Anthropic Helpful and Harmless (HH-RLHF) datasets) as well as Competition Math, DuoRC, CoT GSM8k, QASPER, QuALITY, SummScreen FD, and Spider (a prompt-format sketch follows this entry).
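Instruction-tuned models such as this one expect prompts in the template used during fine-tuning. The helper below sketches the Alpaca/Dolly-style template commonly associated with Dolly-derived data; the exact wording is an assumption and should be checked against the MPT-30B-Instruct model card.

```python
# Sketch of an Alpaca/Dolly-style instruction template, commonly used with models
# fine-tuned on Dolly-derived data. Verify the exact wording against the
# MPT-30B-Instruct model card before relying on it.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def format_instruction(instruction: str) -> str:
    """Wrap a raw instruction in the instruction-following template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)

print(format_instruction("Summarize the difference between LoRA and full fine-tuning."))
```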
- Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality
- xTuring: provides fast, efficient, and simple fine-tuning of LLMs such as LLaMA, GPT-J, GPT-2, OPT, Cerebras-GPT, Galactica, and more. By offering an easy-to-use interface for personalizing LLMs to your own data and application, xTuring makes it simple to build and control LLMs; the entire process can run on your own machine or in your private cloud, ensuring data privacy and security (a quick-start sketch follows this entry).
- Code
- Paper - Not available
- Demo
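The project's quick start centres on creating a model object and fine-tuning it on an instruction dataset in a handful of calls. The sketch below follows that pattern; the module paths, the "llama_lora" model key, and the dataset path are assumptions and should be verified against the xTuring documentation.

```python
# Sketch of the xturing quick-start pattern: create a model, fine-tune it on an
# instruction dataset, then generate. Module paths, the "llama_lora" model key,
# and the dataset path are assumptions; check the xturing docs for the exact API.
from xturing.datasets import InstructionDataset
from xturing.models import BaseModel

dataset = InstructionDataset("./alpaca_data")   # path to a local instruction dataset
model = BaseModel.create("llama_lora")          # LLaMA base model with LoRA adapters
model.finetune(dataset=dataset)                 # fine-tune locally or in a private cloud
print(model.generate(texts=["What is instruction fine-tuning?"]))
```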
- Dolly 2.0: an open-source, commercially usable ChatGPT-style model developed by Databricks, trained on a high-quality, human-generated instruction-following dataset crowdsourced among Databricks employees.
As an open source project in a rapidly evolving field, we welcome contributions of all kinds, including new features and better documentation.
The most recent updates may be available at LLM or LLM Papers.