AberHu's Stars
QwenLM/Qwen2.5
Qwen2.5 is the large language model series developed by the Qwen team at Alibaba Cloud.
facebookresearch/xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
HumanAIGC/EMO
Emote Portrait Alive: Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
LargeWorldModel/LWM
Large World Model -- Modeling Text and Video with Millions of Tokens of Context
LiheYoung/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
THUDM/GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
deepseek-ai/DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
dvlab-research/LongLoRA
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)
yunlong10/Awesome-LLMs-for-Video-Understanding
🔥🔥🔥 Latest papers, code, and datasets on Vid-LLMs.
gkamradt/LLMTest_NeedleInAHaystack
Simple retrieval tests against LLMs at various context lengths to measure accuracy
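For context, the needle-in-a-haystack test inserts a short "needle" fact at a chosen depth within a long distractor context, queries the model, and scores retrieval across a grid of context lengths and depths. Below is a minimal sketch of that procedure, assuming a hypothetical `query_model(prompt) -> str` wrapper around whatever model is under test; it is not the repo's actual code.

```python
# Minimal needle-in-a-haystack sketch (illustrative only).
NEEDLE = "The best thing to do in San Francisco is to eat a sandwich in Dolores Park."
QUESTION = "What is the best thing to do in San Francisco?"

def build_haystack(filler_text: str, context_len: int, depth: float) -> str:
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end) of a
    filler context truncated to roughly `context_len` characters."""
    haystack = filler_text[:context_len]
    pos = int(len(haystack) * depth)
    return haystack[:pos] + " " + NEEDLE + " " + haystack[pos:]

def run_grid(filler_text, query_model, lengths, depths):
    results = {}
    for n in lengths:
        for d in depths:
            prompt = build_haystack(filler_text, n, d) + "\n\n" + QUESTION
            answer = query_model(prompt)
            # Crude scoring: did the answer recover the needle's key phrase?
            results[(n, d)] = "Dolores Park" in answer
    return results
```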
jiaweizzhao/GaLore
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
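GaLore's core idea is to keep optimizer state in a low-rank subspace of each weight's gradient, refreshing the projection periodically via SVD and projecting the update back to full size. The sketch below is a simplified momentum-only variant in the spirit of the method, not the repo's optimizer; `rank`, `update_proj_every`, and the `state` dict are illustrative names.

```python
import torch

# Simplified gradient low-rank projection for a single 2-D weight (illustrative).
def galore_step(weight, grad, state, lr=1e-3, rank=8, update_proj_every=200, beta=0.9):
    step = state.setdefault("step", 0)
    if step % update_proj_every == 0:
        # Refresh the projection from the gradient's top singular vectors.
        U, _, _ = torch.linalg.svd(grad, full_matrices=False)
        state["P"] = U[:, :rank]                       # (out, rank)
        state["m"] = torch.zeros(rank, grad.shape[1])  # optimizer state lives in low rank
    P = state["P"]
    low_rank_grad = P.T @ grad                                    # (rank, in)
    state["m"] = beta * state["m"] + (1 - beta) * low_rank_grad   # momentum in the subspace
    weight -= lr * (P @ state["m"])                               # project update back
    state["step"] = step + 1
```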
open-compass/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
mit-han-lab/smoothquant
[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
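SmoothQuant's key transform migrates quantization difficulty from activations to weights with a per-input-channel scale s_j = max|X_j|^α / max|W_j|^(1-α), so the product X·W is mathematically unchanged while the rescaled activations become much easier to quantize. A minimal sketch under assumed shapes, not the repo's API:

```python
import torch

# X: (tokens, in_features) calibration activations; W: (in_features, out_features).
def smooth(X, W, alpha=0.5, eps=1e-5):
    act_max = X.abs().amax(dim=0)   # per-input-channel activation range
    w_max = W.abs().amax(dim=1)     # per-input-channel weight range
    s = (act_max.clamp(min=eps) ** alpha) / (w_max.clamp(min=eps) ** (1 - alpha))
    # (X / s) @ (diag(s) W) == X @ W, but X / s quantizes well per-tensor.
    return X / s, W * s.unsqueeze(1), s
```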
lmmlzn/Awesome-LLMs-Datasets
A summary of existing representative LLM text datasets.
Xnhyacinth/Awesome-LLM-Long-Context-Modeling
📰 Must-read papers and blogs on LLM-based Long Context Modeling 🔥
XueFuzhao/awesome-mixture-of-experts
A collection of AWESOME things about mixture-of-experts
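For orientation, the common building block behind most MoE layers collected there is a learned top-k router that sends each token to a few experts and mixes their outputs by the normalized gate scores. A generic hedged sketch, not tied to any specific repo:

```python
import torch
import torch.nn.functional as F

# experts: list of callables mapping (n_tokens, d) -> (n_tokens, d);
# gate: a linear layer mapping (n_tokens, d) -> (n_tokens, n_experts).
def moe_forward(x, gate, experts, top_k=2):
    logits = gate(x)                             # (n_tokens, n_experts)
    weights, idx = logits.topk(top_k, dim=-1)    # route each token to its top-k experts
    weights = F.softmax(weights, dim=-1)         # normalize the selected gate scores
    out = torch.zeros_like(x)
    for e, expert in enumerate(experts):
        token_pos, slot = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
        if token_pos.numel() == 0:
            continue
        out[token_pos] += weights[token_pos, slot].unsqueeze(-1) * expert(x[token_pos])
    return out
```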
VITA-MLLM/VITA
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
NVIDIA/NeMo-Curator
Scalable data preprocessing and curation toolkit for LLMs
marlaman/show-me
A visual and transparent open-source alternative to ChatGPT o1
shenyunhang/APE
[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
BradyFU/Video-MME
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
facebookresearch/open-eqa
OpenEQA: Embodied Question Answering in the Era of Foundation Models
isXinLiu/Awesome-MLLM-Safety
A survey of MLLM safety, accepted by the IJCAI-24 Survey Track
nbasyl/DoRA
Official implementation of "DoRA: Weight-Decomposed Low-Rank Adaptation"
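DoRA re-parameterizes a pretrained weight as a magnitude vector times a direction matrix and applies the low-rank update to the direction only, W' = m · (W0 + BA) / ||W0 + BA||_c with a column-wise norm. A minimal sketch assuming paper-style shapes; the function names are hypothetical and this is not the official implementation.

```python
import torch

# W0: (d_out, d_in) pretrained weight; B: (d_out, r); A: (r, d_in);
# m: (d_in,) learnable magnitude, initialized to the column norms of W0.
def dora_weight(W0, m, A, B):
    direction = W0 + B @ A                                 # low-rank update of the direction
    col_norm = direction.norm(p=2, dim=0, keepdim=True)    # (1, d_in) column-wise norm
    return m.unsqueeze(0) * direction / col_norm           # rescale columns by learned magnitude

def dora_forward(x, W0, m, A, B, bias=None):
    # x: (batch, d_in); a standard linear layer using the DoRA-composed weight.
    return torch.nn.functional.linear(x, dora_weight(W0, m, A, B), bias)
```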
zeyofu/BLINK_Benchmark
[ECCV 2024] Evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.org/abs/2404.12390
OpenGVLab/MM-NIAH
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of existing MLLMs to comprehend long multimodal documents.
Wang-ML-Lab/multimodal-needle-in-a-haystack
Code and data for the benchmark "Multimodal Needle in a Haystack (MMNeedle): Benchmarking Long-Context Capability of Multimodal Large Language Models"
visual-haystacks/vhs_benchmark
🔥 Official Benchmark Toolkits for "Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark"