ustcfd's Stars
Beckschen/ViTamin
[CVPR 2024] Official implementation of "ViTamin: Designing Scalable Vision Models in the Vision-language Era"
SkyworkAI/Vitron
A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, and Editing
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
AdrianBZG/llama-multimodal-vqa
Multimodal Instruction Tuning for Llama 3
BAAI-DCAI/Bunny
A family of lightweight multimodal models.
apple/corenet
CoreNet: A library for training deep neural networks
AILab-CVC/SEED
Official implementation of SEED-LLaMA (ICLR 2024).
LlamaFamily/Llama-Chinese
Llama Chinese community. The Llama 3 online demo and fine-tuned models are open, the latest Llama 3 learning resources are aggregated in real time, and all code has been updated for Llama 3. Aims to build the best Chinese Llama model; fully open source and commercially usable.
taishan1994/Llama3.1-Finetuning
Full-parameter, LoRA, and QLoRA fine-tuning of Llama 3.
stanfordnlp/pyreft
ReFT: Representation Finetuning for Language Models
deepseek-ai/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
microsoft/DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
sail-sg/lorahub
[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
uukuguy/multi_loras
Load multiple LoRA modules simultaneously and automatically select the combination of LoRA modules that produces the best answer for each user query.
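The composition idea behind repos like multi_loras and LoraHub can be sketched as a weighted sum of low-rank updates merged into a base weight. A minimal NumPy sketch of the general pattern (function name and the fixed weights are illustrative; LoraHub, for example, searches for the coefficients rather than fixing them):

```python
import numpy as np

def compose_loras(W, loras, weights):
    # Merge several LoRA modules by summing their weighted low-rank updates
    delta = sum(w * (A @ B) for w, (A, B) in zip(weights, loras))
    return W + delta

rng = np.random.default_rng(0)
W = rng.standard_normal((6, 6))                       # base weight matrix
loras = [(rng.standard_normal((6, 2)),                # A factor, rank 2
          rng.standard_normal((2, 6)))                # B factor
         for _ in range(3)]
# In practice the weights would come from a router or an optimizer;
# they are fixed here purely for illustration.
merged = compose_loras(W, loras, weights=[0.5, 0.3, 0.2])
assert merged.shape == W.shape
```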
Leeroo-AI/mergoo
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
thunlp/LLaVA-UHD
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
google-deepmind/recurrentgemma
Open weights language model from Google DeepMind, based on Griffin.
Suikasxt/PMG
Repository for the paper "Personalized Multimodal Response Generation with Large Language Models"
LingyvKong/OneChart
[ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"
yfeng95/PoseGPT
Ivan-Tang-3D/Any2Point
[ECCV2024] Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding
NVIDIA/NeMo-Aligner
Scalable toolkit for efficient model alignment
forhaoliu/ringattention
Transformers with Arbitrarily Large Context
YuchenLiu98/COMM
PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"
eric-ai-lab/MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
ytongbai/LVM
GraphPKU/PiSSA
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models
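PiSSA's core idea fits in a few lines: initialize the adapter factors from the top-r singular values and vectors of the pretrained weight and keep the frozen residual, so the model's output is unchanged at initialization while the principal directions become trainable. A minimal NumPy sketch (function name and shapes are illustrative, not the repo's API):

```python
import numpy as np

def pissa_init(W, r):
    # SVD of the pretrained weight matrix
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Principal components initialize the trainable low-rank factors
    A = U[:, :r] * np.sqrt(S[:r])            # shape (d_out, r)
    B = np.sqrt(S[:r])[:, None] * Vt[:r]     # shape (r, d_in)
    # Frozen residual; W_res + A @ B reproduces W exactly at init
    W_res = W - A @ B
    return A, B, W_res

W = np.random.randn(8, 8)
A, B, W_res = pissa_init(W, r=2)
assert np.allclose(W_res + A @ B, W)
```

Because A @ B equals the rank-r truncation of W, fine-tuning A and B adapts the weight's principal subspace while the residual stays fixed.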
BlinkDL/RWKV-LM
RWKV is an RNN with transformer-level LLM performance that can be trained directly like a GPT (parallelizable). It combines the best of RNNs and transformers: great performance, fast inference, low VRAM use, fast training, "infinite" context length, and free sentence embeddings.
BAAI-DCAI/DataOptim
A collection of visual instruction tuning datasets.
csuhan/OneLLM
[CVPR 2024] OneLLM: One Framework to Align All Modalities with Language