ustcfd's Stars
OpenMOSS/AnyGPT
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
CasualGANPapers/Make-A-Scene
PyTorch implementation of Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors
luogen1996/LaVIN
[NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models"
google-research/big_vision
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Tencent/HunyuanDiT
Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
hkust-nlp/deita
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
NVlabs/RADIO
Official repository for "AM-RADIO: Reduce All Domains Into One"
jiaweizzhao/GaLore
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
microsoft/unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Alpha-VLLM/Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
OpenGVLab/LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
deepseek-ai/DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DefTruth/Awesome-LLM-Inference
📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
X-PLUG/mPLUG-DocOwl
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
joez17/ChatBridge
ChatBridge, an approach to learning a unified multimodal model to interpret, correlate, and reason about various modalities without relying on all combinations of paired data.
saidwivedi/TokenHMR
[CVPR 2024] TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation
mingyuan-zhang/LMM
Large Motion Model for Unified Multi-Modal Motion Generation
NVlabs/VILA
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
CrazyBoyM/llama3-Chinese-chat
Chinese repository for Llama3 and Llama3.1 (companion book in progress... interesting fine-tuned and modified weights from community members and vendors, plus video and document tutorials on training, inference, evaluation, and deployment)
HyperGAI/HPT
HPT - Open Multimodal LLMs from HyperGAI
treeaaa/fine_tune_llava1.6_copy
TIGER-AI-Lab/Mantis
Official code for Paper "Mantis: Multi-Image Instruction Tuning"
shibing624/MedicalGPT
MedicalGPT: Training Your Own Medical GPT Model with a ChatGPT-style Training Pipeline. Trains medical LLMs, implementing continued pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
magic-research/PLLaVA
Official repository for the paper PLLaVA
TencentARC/LLaMA-Pro
[ACL 2024] Progressive LLaMA with Block Expansion.
linjh1118/Llama3-Chinese-ORPO
A Chinese version of Llama3, obtained from Llama3 via further CPT, SFT, and ORPO
CrazyBoyM/phi3-Chinese
Phi3 Chinese repository
AILab-CVC/SEED-X
Multimodal Models in the Real World
EricGuo5513/momask-codes
Official implementation of "MoMask: Generative Masked Modeling of 3D Human Motions (CVPR2024)"
mbzuai-oryx/LLaVA-pp
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)