LinMu7177's Stars
Meituan-AutoML/MobileVLM
Strong and Open Vision Language Assistant for Mobile Devices
DaiShiResearch/TransNeXt
[CVPR 2024] Code release for TransNeXt model
OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal dialogue model approaching GPT-4V performance.
snap-research/Panda-70M
[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
modelscope/modelscope
ModelScope: bring the notion of Model-as-a-Service to life.
NUS-HPC-AI-Lab/Neural-Network-Parameter-Diffusion
We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters
LargeWorldModel/LWM
google-research/syn-rep-learn
Learning from synthetic data - code and models
lxtGH/OMG-Seg
OMG-LLaVA and OMG-Seg codebase
UX-Decoder/FIND
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
jacobgil/pytorch-grad-cam
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
casper9429-kth/Siamese-Masked-Autoencoders---Learning-and-Exploration
Course project for DD2412 Advanced Deep Learning at KTH, by Casper, Magnus, and Friso. Focus: self-supervised learning and computer vision with SiamMAE, replicating core results and exploring potential research extensions.
SHI-Labs/VCoder
VCoder: Versatile Vision Encoders for Multimodal Large Language Models, arXiv 2023 / CVPR 2024
lzw-lzw/GroundingGPT
[ACL 2024] GroundingGPT: Language-Enhanced Multi-modal Grounding Model
PKU-YuanGroup/LanguageBind
[ICLR 2024] Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
CircleRadon/Osprey
[CVPR 2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
poloclub/cnn-explainer
Learning Convolutional Neural Networks with Interactive Visualization.
yumingj/Text2Human
Code for Text2Human (SIGGRAPH 2022). Paper: Text2Human: Text-Driven Controllable Human Image Generation
zalandoresearch/pytorch-vq-vae
PyTorch implementation of VQ-VAE by Aäron van den Oord et al.
FutureXiang/soda
Unofficial implementation of "SODA: Bottleneck Diffusion Models for Representation Learning"
InternLM/InternLM-XComposer
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output
rosinality/vq-vae-2-pytorch
Implementation of Generating Diverse High-Fidelity Images with VQ-VAE-2 in PyTorch
eric-ai-lab/MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
X2FD/LVIS-INSTRUCT4V
openai/consistencydecoder
Consistency Distilled Diff VAE
CompVis/stable-diffusion
A latent text-to-image diffusion model
fudan-zvg/Semantic-Segment-Anything
Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B).
UX-Decoder/Segment-Everything-Everywhere-All-At-Once
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
microsoft/X-Decoder
[CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language