foundation-models
There are 158 public repositories under the foundation-models topic.
hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Luodian/Otter
🦦 Otter, a multi-modal model based on OpenFlamingo (an open-source version of DeepMind's Flamingo), trained on MIMIC-IT, with improved instruction-following and in-context learning abilities.
NExT-GPT/NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs, such as miniGPT4, StableLM, and MOSS.
CLUEbenchmark/SuperCLUE
SuperCLUE: A comprehensive benchmark for general-purpose Chinese foundation models
baaivision/EVA
EVA Series: Visual Representation Fantasies from BAAI
amazon-science/chronos-forecasting
Chronos: Pretrained (Language) Models for Probabilistic Time Series Forecasting
deepseek-ai/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
autodistill/autodistill
From images to inference with no labeling (use foundation models to train supervised models).
baaivision/Emu
Emu Series: Generative Multimodal Models from BAAI
tatsu-lab/alpaca_eval
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
hyp1231/awesome-llm-powered-agent
Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...
time-series-foundation-models/lag-llama
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
OpenGVLab/InternVideo
Video Foundation Models & Data for Multimodal Understanding
OFA-Sys/ONE-PEACE
A general representation model across vision, audio, and language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
mlmed/torchxrayvision
TorchXRayVision: A library of chest X-ray datasets and models. Classifiers, segmentation, and autoencoders.
HazyResearch/meerkat
Create interactive views of any dataset.
llm-jp/awesome-japanese-llm
Overview of Japanese LLMs
qingsongedu/Awesome-TimeSeries-SpatioTemporal-LM-LLM
A professional list on Large (Language) Models and Foundation Models (LLM, LM, FM) for Time Series, Spatiotemporal, and Event Data.
MrGiovanni/ModelsGenesis
[MICCAI 2019] [MEDIA 2020] Models Genesis
NVlabs/FasterViT
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention
zjunlp/KnowledgeEditingPapers
Must-read Papers on Knowledge Editing for Large Language Models.
uncbiag/Awesome-Foundation-Models
A curated list of foundation models for vision and language tasks
mbzuai-oryx/groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
HazyResearch/hyena-dna
Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena
NVlabs/EmerNeRF
PyTorch Implementation of EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision
baaivision/tokenize-anything
Tokenize Anything via Prompting
FoundationVision/Groma
Grounded Multimodal Large Language Model with Localized Visual Tokenization
huangwl18/VoxPoser
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
baaivision/Uni3D
[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI
OpenRobotLab/PointLLM
[arXiv 2023] PointLLM: Empowering Large Language Models to Understand Point Clouds
Azure/gen-cv
Vision AI Solution Accelerator
jqin4749/MindVideo
Official code base for MinD-Video
mims-harvard/UniTS
A unified multi-task time series model.