SeuTao's Stars
facebookresearch/segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
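For reference, a minimal sketch of image-prompted inference in the style of the repo's README, assuming the sam2 package is installed and a checkpoint has been downloaded; the checkpoint path, config name, and image file below are illustrative assumptions.

```python
import numpy as np
import torch
from PIL import Image
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

# Illustrative paths: point these at a downloaded SAM 2 checkpoint and its matching config.
checkpoint = "./checkpoints/sam2_hiera_large.pt"
model_cfg = "sam2_hiera_l.yaml"
predictor = SAM2ImagePredictor(build_sam2(model_cfg, checkpoint))

# Load an RGB image as a numpy array (example.jpg is a placeholder).
image = np.array(Image.open("example.jpg").convert("RGB"))

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    predictor.set_image(image)
    # A single foreground point prompt; predict() returns candidate masks with scores.
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[500, 375]]),
        point_labels=np.array([1]),
    )
```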
xzhih/one-key-hidpi
Enable macOS HiDPI with one click and get native-like display settings.
bklieger-groq/g1
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
QwenLM/Qwen2-VL
Qwen2-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
pytorch/torchtitan
A native PyTorch Library for large model training
X-PLUG/mPLUG-Owl
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
NVlabs/VILA
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
zou-group/textgrad
TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients.
karpathy/nano-llama31
nanoGPT-style version of Llama 3.1
chaidiscovery/chai-lab
Chai-1, SOTA model for biomolecular structure prediction
showlab/Show-o
Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
VITA-MLLM/VITA
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
BAAI-DCAI/Bunny
A family of lightweight multimodal models.
DAMO-NLP-SG/VideoLLaMA2
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
lucidrains/transfusion-pytorch
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI
AIDC-AI/Ovis
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
Alpha-VLLM/Lumina-mGPT
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
bfshi/scaling_on_scales
When do we not need larger vision models?
Oryx-mllm/Oryx
MLLM for On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
OpenGVLab/OmniCorpus
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
yuweihao/MM-Vet
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)
CircleRadon/TokenPacker
The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM".
TIGER-AI-Lab/Mantis
Official code for the paper "Mantis: Multi-Image Instruction Tuning"
JUNJIE99/MLVU
🔥🔥MLVU: Multi-task Long Video Understanding Benchmark
baaivision/DenseFusion
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
lqtrung1998/mwp_ReFT
sail-sg/regmix
🧬 RegMix: Data Mixture as Regression for Language Model Pre-training
scenarios/WeMM
alonj/Same-Task-More-Tokens
The code for the paper: "Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models"
xverse-ai/XVERSE-MoE-A36B
XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.