NROwind's Stars
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
meta-llama/llama-recipes
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods, covering single- and multi-node GPU setups. Supports default and custom datasets for applications such as summarization and Q&A, and a number of inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps showcasing Meta Llama for WhatsApp & Messenger.
liguodongiot/llm-action
This project shares the technical principles behind large language models along with hands-on experience (LLM engineering and real-world LLM application deployment).
salesforce/LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
InternLM/InternLM
Official release of InternLM2.5 base and chat models, with 1M-token context support.
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
open-compass/opencompass
OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, Llama2, Qwen, GLM, Claude, etc.) over 100+ datasets.
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
mlfoundations/open_flamingo
An open-source framework for training large multimodal models.
315386775/DeepLearing-Interview-Awesome-2024
A collection of interview questions and answers for AIGC, CV, and LLM roles, along with new ideas, questions, resources, and projects encountered in work and research.
salesforce/ALBEF
Code for ALBEF: a new vision-language pre-training method
datawhalechina/tiny-universe
"A White-Box Guide to Building Large Models": a Tiny-Universe built entirely from scratch by hand.
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks.
multimodal-art-projection/MAP-NEO
mindspore-courses/step_into_llm
MindSpore online courses: Step into LLM
Paranioar/Awesome_Matching_Pretraining_Transfering
A paper list covering large multi-modality models, parameter-efficient fine-tuning, vision-language pretraining, and conventional image-text matching, for preliminary insight.
TIGER-AI-Lab/Mantis
Official code for Paper "Mantis: Multi-Image Instruction Tuning" (TMLR2024)
yfzhang114/Awesome-Multimodal-Large-Language-Models
Reading notes about Multimodal Large Language Models, Large Language Models, and Diffusion Models
daixiangzi/Awesome-Token-Compress
A paper list of recent work on token compression for ViT and VLM.
alipay/Ant-Multi-Modal-Framework
Research code from the Multimodal Cognition Team at Ant Group.
liguopeng0923/UCVGL
[CVPR 2024🔥] Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization
OpenGVLab/LCL
Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning
PromptExpert/blogs
zhourax/VEGA
fawazsammani/awesome-vision-language-pretraining
Awesome Vision-Language Pretraining Papers
FeipengMa6/VLoRA
[NeurIPS 2024] Visual Perception by Large Language Model’s Weights
HKUST-LongGroup/CoMM
Official repository for CoMM Dataset
Zi-hao-Wei/Efficient-Vision-Language-Pre-training-by-Cluster-Masking
[CVPR 2024] Improving language-visual pretraining efficiency by performing cluster-based masking on images.
xihuai18/arxiv-sanity-x