foundation-models
There are 158 public repositories under the foundation-models topic.
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Luodian/Otter
🦦 Otter, a multi-modal model based on OpenFlamingo (an open-source version of DeepMind's Flamingo), trained on MIMIC-IT, with improved instruction-following and in-context learning abilities.
NExT-GPT/NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs, such as miniGPT4, StableLM, and MOSS.
CLUEbenchmark/SuperCLUE
SuperCLUE: A comprehensive benchmark for general-purpose Chinese foundation models
baaivision/EVA
EVA Series: Visual Representation Fantasies from BAAI
amazon-science/chronos-forecasting
Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting
deepseek-ai/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
autodistill/autodistill
From images to inference with no labeling (use foundation models to train supervised models).
baaivision/Emu
Emu Series: Generative Multimodal Models from BAAI
tatsu-lab/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
hyp1231/awesome-llm-powered-agent
Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...
time-series-foundation-models/lag-llama
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
OpenGVLab/InternVideo
Video Foundation Models & Data for Multimodal Understanding
OFA-Sys/ONE-PEACE
A general representation model across vision, audio, and language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
mlmed/torchxrayvision
TorchXRayVision: A library of chest X-ray datasets and models. Classifiers, segmentation, and autoencoders.
HazyResearch/meerkat
Create interactive views of any dataset.
llm-jp/awesome-japanese-llm
Overview of Japanese LLMs
qingsongedu/Awesome-TimeSeries-SpatioTemporal-LM-LLM
A professional list on Large (Language) Models and Foundation Models (LLM, LM, FM) for Time Series, Spatiotemporal, and Event Data.
MrGiovanni/ModelsGenesis
[MICCAI 2019] [MEDIA 2020] Models Genesis
NVlabs/FasterViT
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention
zjunlp/KnowledgeEditingPapers
Must-read Papers on Knowledge Editing for Large Language Models.
uncbiag/Awesome-Foundation-Models
A curated list of foundation models for vision and language tasks
mbzuai-oryx/groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
HazyResearch/hyena-dna
Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena
NVlabs/EmerNeRF
PyTorch Implementation of EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision
baaivision/tokenize-anything
Tokenize Anything via Prompting
FoundationVision/Groma
Grounded Multimodal Large Language Model with Localized Visual Tokenization
huangwl18/VoxPoser
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
baaivision/Uni3D
[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI
OpenRobotLab/PointLLM
[arXiv 2023] PointLLM: Empowering Large Language Models to Understand Point Clouds
Azure/gen-cv
Vision AI Solution Accelerator
jqin4749/MindVideo
Official code base for MinD-Video
mims-harvard/UniTS
A unified multi-task time series model.