seanzhuh's Stars
VITA-MLLM/VITA
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
yixuan730/DetToolChain
DetToolChain: A new prompting paradigm to unleash the detection ability of MLLMs
scratchapixel/scratchapixel-code
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
slothfulxtx/Texture-GS
[ECCV 2024] The official repo for "Texture-GS: Disentangling the Geometry and Texture for 3D Gaussian Splatting Editing"
ruiqixu37/Nuvo
Personal implementation of the paper "Nuvo: Neural UV Mapping for Unruly 3D Representations"
HKUST-LongGroup/Awesome-Open-Vocabulary-Detection-and-Segmentation
Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
BradyFU/Video-MME
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
FoundationVision/VAR
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
Rubics-Xuan/MRES
This repo holds the official code and data for "Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation", accepted by CVPR 2024.
baaivision/tokenize-anything
[ECCV 2024] Tokenize Anything via Prompting
V3Det/V3Det
shenyunhang/APE
[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
bytedance/OmniScient-Model
This repo contains the code for our paper Towards Open-Ended Visual Recognition with Large Language Model
MinaGhadimiAtigh/hyperbolic_representation_learning
The repository for Hyperbolic Representation Learning for Computer Vision, ECCV 2022
valeoai/Awesome-Unsupervised-Object-Localization
Curated list of awesome works on unsupervised object localization in 2D images.
microsoft/SoM
Set-of-Mark Prompting for GPT-4V and LMMs
dome272/Diffusion-Models-pytorch
PyTorch implementation of Diffusion Models (https://arxiv.org/pdf/2006.11239.pdf)
apple/ml-ferret
baaivision/Uni3D
[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI
Paranioar/UniPT
[CVPR 2024] The code of "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory"
prannaykaul/mm-ovod
Official repo for our ICML 2023 paper: "Multi-Modal Classifiers for Open-Vocabulary Object Detection"
mlzxy/devit
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
OpenGVLab/VisionLLM
VisionLLM Series
xmed-lab/CLIP_Surgery
CLIP Surgery for Better Explainability with Enhancement in Open-Vocabulary Tasks
witnessai/Awesome-Open-Vocabulary-Object-Detection
A curated list of papers, datasets and resources pertaining to open vocabulary object detection.
mlfoundations/open_clip
An open source implementation of CLIP.
open-mmlab/playground
A central hub for gathering and showcasing amazing projects that extend OpenMMLab with SAM and other exciting features.
CamuseCao/XMU-thesis
A LaTeX template