ustcfd's Stars
BlinkDL/RWKV-LM
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
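The key claim above (an RNN that can also be trained in parallel like a GPT) can be illustrated with a much-simplified linear recurrence. This is a toy sketch, not RWKV's actual WKV kernel: it just shows that the same outputs can be computed token-by-token with a fixed-size state (RNN mode, cheap inference) or all at once with prefix sums (parallel mode, fast training).

```python
import numpy as np

def recurrent_mode(k, v):
    """Process tokens one at a time, carrying a fixed-size running state."""
    T, d = v.shape
    num = np.zeros(d)   # running weighted sum of values
    den = 0.0           # running sum of weights
    out = np.empty_like(v)
    for t in range(T):
        w = np.exp(k[t])        # toy scalar weight per token
        num = num + w * v[t]
        den = den + w
        out[t] = num / den
    return out

def parallel_mode(k, v):
    """Same outputs via cumulative sums over the whole sequence (a parallel scan)."""
    w = np.exp(k)[:, None]              # (T, 1) token weights
    num = np.cumsum(w * v, axis=0)      # prefix sums of weighted values
    den = np.cumsum(w, axis=0)          # prefix sums of weights
    return num / den

rng = np.random.default_rng(0)
k = rng.normal(size=5)          # toy per-token "keys" (scalars here)
v = rng.normal(size=(5, 3))     # toy per-token values
assert np.allclose(recurrent_mode(k, v), parallel_mode(k, v))
```

The two modes agree exactly; the real model adds time decay, gating, and channel mixing on top of this basic recurrence/scan duality.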
01-ai/Yi
A series of large language models trained from scratch by developers @01-ai
deepseek-ai/DeepSeek-VL
DeepSeek-VL: Towards Real-World Vision-Language Understanding
microsoft/DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Yuliang-Liu/Monkey
【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
ytongbai/LVM
KimMeen/Time-LLM
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
ActiveVisionLab/Awesome-LLM-3D
Awesome-LLM-3D: a curated list of resources on multi-modal large language models in the 3D world
stanfordnlp/pyreft
ReFT: Representation Finetuning for Language Models
eric-ai-lab/MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
haoliuhl/ringattention
Transformers with Arbitrarily Large Context
google-deepmind/recurrentgemma
Open weights language model from Google DeepMind, based on Griffin.
NVIDIA/NeMo-Aligner
Scalable toolkit for efficient model alignment
sail-sg/lorahub
[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
csuhan/OneLLM
[CVPR 2024] OneLLM: One Framework to Align All Modalities with Language
Leeroo-AI/mergoo
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
thunlp/LLaVA-UHD
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
GraphPKU/PiSSA
PiSSA: Principal Singular Values and Singular Vectors Adaptation of Large Language Models (NeurIPS 2024 Spotlight)
FusionBrainLab/OmniFusion
OmniFusion — a multimodal model to communicate using text and images
yfeng95/PoseGPT
LingyvKong/OneChart
[ACM'MM 2024 Oral] Official code for "OneChart: Purify the Chart Structural Extraction via One Auxiliary Token"
YuchenLiu98/COMM
PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"
zamling/PSALM
[ECCV2024] This is an official implementation for "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"
HuggingAGI/HuggingArxiv
taishan1994/Llama3.1-Finetuning
Full-parameter fine-tuning, LoRA fine-tuning, and QLoRA fine-tuning of Llama 3.
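The LoRA variant mentioned above trains only a low-rank update on top of a frozen weight. A minimal NumPy sketch of that core idea (illustrative only, not this repo's code; QLoRA additionally quantizes the frozen base weights, which is beyond this sketch): instead of updating the full weight W of shape (d_out, d_in), train a small pair B (d_out, r) and A (r, d_in) and use W_eff = W + (alpha / r) * B @ A.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 8, 8, 2, 16

W = rng.normal(size=(d_out, d_in))       # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01    # trainable, small random init
B = np.zeros((d_out, r))                 # trainable, zero init

def forward(x, W, A, B):
    # Effective weight is the frozen W plus the scaled low-rank update.
    return x @ (W + (alpha / r) * B @ A).T

x = rng.normal(size=(1, d_in))
# Because B starts at zero, the adapted model is initially identical to the base model.
assert np.allclose(forward(x, W, A, B), x @ W.T)
```

The trainable parameter count is r * (d_in + d_out) = 32 here, versus 64 for the full matrix; at realistic dimensions the savings are far larger.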
uukuguy/multi_loras
Load multiple LoRA modules simultaneously and automatically switch to the appropriate combination of LoRA modules to generate the best answer for each user query.
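The query-dependent switching described above can be sketched in plain Python. Everything here is hypothetical (adapter names and the keyword heuristic are invented for illustration, not multi_loras's actual routing logic): keep several adapters registered and pick one per query.

```python
# Hypothetical adapter registry: maps a task tag to a loaded LoRA adapter name.
ADAPTERS = {
    "code": "lora-code-expert",
    "math": "lora-math-expert",
    "chat": "lora-general-chat",
}

def route(query: str) -> str:
    """Pick a LoRA adapter for a query via a toy keyword heuristic."""
    q = query.lower()
    if any(kw in q for kw in ("def ", "python", "bug", "compile")):
        return ADAPTERS["code"]
    if any(kw in q for kw in ("integral", "prove", "equation")):
        return ADAPTERS["math"]
    return ADAPTERS["chat"]    # default: general-purpose adapter

assert route("Fix this Python bug") == "lora-code-expert"
assert route("Prove the equation holds") == "lora-math-expert"
```

A real router would typically score queries with an embedding model rather than keywords, and could also blend several adapters' weights instead of picking exactly one.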
Ivan-Tang-3D/Any2Point
[ECCV2024] Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding
BAAI-DCAI/DataOptim
A collection of visual instruction tuning datasets.
Suikasxt/PMG
The repository for the paper "Personalized Multimodal Response Generation with Large Language Models"
AdityaNG/DriveLLaVA
Training LLaVA on the CommaVQ dataset to produce tokenized control signals for driving