zhanghaonan777's Stars
xai-org/grok-1
Grok open release
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
microsoft/autogen
A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
camel-ai/camel
🐫 CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society (NeurIPS 2023) https://www.camel-ai.org
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model proposed by Alibaba Cloud.
1rgs/jsonformer
A Bulletproof Way to Generate Structured JSON from Language Models
tyxsspa/AnyText
Official implementation code of the paper "AnyText: Multilingual Visual Text Generation And Editing"
xlang-ai/OpenAgents
OpenAgents: An Open Platform for Language Agents in the Wild
NExT-GPT/NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
PKU-YuanGroup/Chat-UniVi
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
OpenGVLab/VideoMamba
VideoMamba: State Space Model for Efficient Video Understanding
triton-inference-server/tensorrtllm_backend
The Triton TensorRT-LLM Backend
jiasenlu/vilbert_beta
lichao-sun/SoraReview
The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".
Yangyi-Chen/Multimodal-AND-Large-Language-Models
Paper list about multimodal and large language models, used only to record papers I read from the daily arXiv for personal reference.
airaria/Visual-Chinese-LLaMA-Alpaca
Multimodal Chinese LLaMA & Alpaca large language model (VisualCLA)
Yutong-Zhou-cv/Awesome-Multimodality
A survey on multimodal learning research.
ninehills/llm-inference-benchmark
LLM inference benchmark
lzw-lzw/GroundingGPT
[ACL 2024] GroundingGPT: Language-Enhanced Multi-modal Grounding Model
boheumd/MA-LMM
[CVPR 2024] MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
NiuTrans/ABigSurveyOfLLMs
A collection of 150+ surveys on LLMs
wangyuxinwhy/generate
A Python Package to Access World-Class Generative Models
sunanhe/MKT
Official implementation of "Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer".
pengfei-luo/multimodal-knowledge-graph
A collection of resources on multimodal knowledge graph, including datasets, papers and contests.
TencentARC-QQ/TagGPT
TagGPT: Large Language Models are Zero-shot Multimodal Taggers
chenjiashuo123/TAAC-2021-Task2-Rank6
6th place in the finals of Track 2, 2021 Tencent Advertising Algorithm Competition
xubodhu/RDS