TAEYOUNG-SYG's Stars
krahets/hello-algo
《Hello 算法》 ("Hello Algo"): an animated, illustrated data structures and algorithms tutorial with one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; an English version is in progress.
haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
state-spaces/mamba
Mamba SSM architecture
apple/ml-ferret
WooooDyy/LLM-Agent-Paper-List
The reading list for the 86-page survey "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
rom1504/img2dataset
Easily turn large sets of image URLs into an image dataset: it can download, resize, and package 100M URLs in 20 hours on a single machine.
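A minimal sketch of the library's Python entry point, based on the project's README; the file names and parallelism settings below are illustrative assumptions, not recommended values:

    from img2dataset import download

    download(
        url_list="urls.txt",         # one image URL per line (illustrative file name)
        input_format="txt",
        output_folder="my_dataset",
        output_format="webdataset",  # tar shards suitable for streaming training
        image_size=256,              # resize every image to 256x256
        processes_count=16,          # parallel worker processes (tune to your machine)
        thread_count=64,             # download threads per process
    )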
mlfoundations/open_flamingo
An open-source framework for training large multimodal models.
hustvl/Vim
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
MzeroMiko/VMamba
VMamba: Visual State Space Models; the code is based on Mamba.
ContinualAI/avalanche
Avalanche: an End-to-End Library for Continual Learning based on PyTorch.
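A minimal continual-learning loop sketched from Avalanche's quickstart; module paths and hyperparameters are assumptions and may differ across library versions:

    import torch
    from avalanche.benchmarks.classic import SplitMNIST
    from avalanche.models import SimpleMLP
    from avalanche.training.supervised import Naive

    benchmark = SplitMNIST(n_experiences=5)    # MNIST split into 5 sequential tasks
    model = SimpleMLP(num_classes=10)
    strategy = Naive(                          # plain fine-tuning baseline strategy
        model,
        torch.optim.SGD(model.parameters(), lr=1e-3),
        torch.nn.CrossEntropyLoss(),
        train_mb_size=32,
        train_epochs=1,
    )

    for experience in benchmark.train_stream:  # train on one task at a time
        strategy.train(experience)
        strategy.eval(benchmark.test_stream)   # evaluate forgetting on all tasks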
apple/ml-aim
This repository provides code and model checkpoints for the AIMv1 and AIMv2 research projects.
redotvideo/mamba-chat
Mamba-Chat: A chat LLM based on the state-space model architecture 🐍
pjlab-sys4nlp/llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
eric-ai-lab/MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
mbzuai-oryx/groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first model of its kind, capable of generating natural-language responses seamlessly integrated with object segmentation masks.
PKU-YuanGroup/LanguageBind
[ICLR 2024 🔥] Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
dvlab-research/LLaMA-VID
LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
GaryYufei/AlignLLMHumanSurvey
Aligning Large Language Models with Human: A Survey
OpenRobotLab/PointLLM
[ECCV 2024 Best Paper Candidate] PointLLM: Empowering Large Language Models to Understand Point Clouds
hzxie/CityDreamer
The official implementation of "CityDreamer: Compositional Generative Model of Unbounded 3D Cities". (Xie et al., CVPR 2024)
lupantech/ScienceQA
Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".
allenai/unified-io-2
llava-rlhf/LLaVA-RLHF
Aligning LMMs with Factually Augmented RLHF
GigaAI-research/General-World-Models-Survey
tsb0601/MMVP
QingruZhang/AdaLoRA
AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning (ICLR 2023).
FuxiaoLiu/LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
YifanXu74/Libra
Simple PyTorch implementation of "Libra: Building Decoupled Vision System on Large Language Models" (ICML 2024)
BAAI-DCAI/DataOptim
A collection of visual instruction tuning datasets.
THUNLP-MT/ModelCompose
Official code for our paper "Model Composition for Multimodal Large Language Models"