zhanghaonan777's Stars
xai-org/grok-1
Grok open release
huggingface/pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
microsoft/autogen
A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap
FlagOpen/FlagEmbedding
Retrieval and Retrieval-augmented LLMs
google/gemma_pytorch
The official PyTorch implementation of Google's Gemma models
camel-ai/camel
🐫 CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society (NeurIPS 2023) https://www.camel-ai.org
allenai/OLMo
Modeling, training, eval, and inference code for OLMo
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model proposed by Alibaba Cloud.
1rgs/jsonformer
A Bulletproof Way to Generate Structured JSON from Language Models
tyxsspa/AnyText
Official implementation code of the paper "AnyText: Multilingual Visual Text Generation And Editing"
xlang-ai/OpenAgents
OpenAgents: An Open Platform for Language Agents in the Wild
NExT-GPT/NExT-GPT
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
PKU-YuanGroup/Chat-UniVi
[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
OpenGVLab/VideoMamba
VideoMamba: State Space Model for Efficient Video Understanding
triton-inference-server/tensorrtllm_backend
The Triton TensorRT-LLM Backend
jiasenlu/vilbert_beta
lichao-sun/SoraReview
The official GitHub page for the review paper "Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models".
Yangyi-Chen/Multimodal-AND-Large-Language-Models
Paper list about multimodal and large language models, used only to record papers I read from the daily arXiv for personal reference.
airaria/Visual-Chinese-LLaMA-Alpaca
Multimodal Chinese LLaMA & Alpaca large language model (VisualCLA)
Yutong-Zhou-cv/Awesome-Multimodality
A survey on multimodal learning research.
ninehills/llm-inference-benchmark
LLM inference benchmark
lzw-lzw/GroundingGPT
[ACL 2024] GroundingGPT: Language-Enhanced Multi-modal Grounding Model
boheumd/MA-LMM
[CVPR 2024] MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
NiuTrans/ABigSurveyOfLLMs
A collection of 150+ surveys on LLMs
wangyuxinwhy/generate
A Python Package to Access World-Class Generative Models
sunanhe/MKT
Official implementation of "Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer".
pengfei-luo/multimodal-knowledge-graph
A collection of resources on multimodal knowledge graph, including datasets, papers and contests.
TencentARC-QQ/TagGPT
TagGPT: Large Language Models are Zero-shot Multimodal Taggers
chenjiashuo123/TAAC-2021-Task2-Rank6
6th place in the finals of Track 2, 2021 Tencent Advertising Algorithm Competition
xubodhu/RDS