DimplesL's Stars
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
arcee-ai/mergekit
Tools for merging pretrained large language models.
YuchenLiu98/COMM
PyTorch code for the paper "From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models"
pharmapsychotic/clip-interrogator
Image to prompt with BLIP and CLIP
InternLM/xtuner
An efficient, flexible and full-featured toolkit for fine-tuning LLMs (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
InternLM/lmdeploy
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
ollama/ollama
Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.
google-research/omniglue
Code release for CVPR'24 submission 'OmniGlue'
SHI-Labs/CuMo
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
chengtao-lv/PTQ4SAM
[CVPR 2024] PTQ4SAM: Post-Training Quantization for Segment Anything
state-spaces/mamba
Mamba SSM architecture
LlamaFamily/Llama-Chinese
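Llama Chinese community: the Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and available for commercial use.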
Vision-CAIR/MiniGPT4-video
Official code for Goldfish model for long video understanding and MiniGPT4-video for short video understanding
luogen1996/LLaVA-HR
LLaVA-HR: High-Resolution Large Language-Vision Assistant
Yuliang-Liu/Monkey
[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models
InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
boheumd/MA-LMM
[CVPR 2024] MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
thunlp/LLaVA-UHD
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
PeterJaq/Awesome-Autonomous-Driving
karpathy/llama2.c
Inference Llama 2 in one file of pure C
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
CompVis/latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
mini-sora/minisora
MiniSora: a community that aims to explore the implementation path and future development direction of Sora.
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
THUDM/CogVLM
A state-of-the-art open visual language model | multimodal pretrained model
THUDM/CogCoM
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
LiheYoung/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
tsb0601/MMVP