yuminsuh's Stars
OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
EvolvingLMMs-Lab/lmms-eval
Accelerating the development of large multimodal models (LMMs) with lmms-eval
facebookresearch/unibench
Python library to evaluate VLMs' robustness across diverse benchmarks
OpenDriveLab/OpenLane-V2
[NeurIPS 2023 Track Datasets and Benchmarks] OpenLane-V2: The First Perception and Reasoning Benchmark for Road Driving
LLaVA-VL/LLaVA-NeXT
OpenDriveLab/DriveLM
[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering
mbzuai-oryx/groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
mlfoundations/MINT-1T
MINT-1T: A one trillion token multimodal interleaved dataset.
zkkli/I-ViT
[ICCV 2023] I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference
autodistill/autodistill
Images to inference with no labeling (use foundation models to train supervised models).
facebookresearch/paco
This repo contains the documentation and code needed to use the PACO dataset: data loaders, training and evaluation scripts for object, part, and attribute prediction models, query evaluation scripts, and visualization notebooks.
huaaaliu/RGBX_Semantic_Segmentation
carla-simulator/carla
Open-source simulator for autonomous driving research.
karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
karpathy/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
chenxin-dlut/TransT
Transformer Tracking (CVPR2021)
OpenNMT/OpenNMT-py
Open Source Neural Machine Translation and (Large) Language Models in PyTorch
google-research/l2p
Learning to Prompt (L2P) for Continual Learning @ CVPR22 and DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning @ ECCV22
petrgeorgievsky/gtaRenderHook
GTA SA rendering hook
ifzhang/ByteTrack
[ECCV 2022] ByteTrack: Multi-Object Tracking by Associating Every Detection Box
nwojke/deep_sort
Simple Online Realtime Tracking with a Deep Association Metric
YiwuZhong/SGG_from_NLS
[ICCV 2021] Official code for "Learning to Generate Scene Graph from Natural Language Supervision"
ChenRocks/UNITER
Research code for ECCV 2020 paper "UNITER: UNiversal Image-TExt Representation Learning"
microsoft/GLIP
Grounded Language-Image Pre-training
voxel51/fiftyone
Refine high-quality datasets and visual AI models
davidtvs/PyTorch-ENet
PyTorch implementation of ENet
Lightning-Universe/lightning-bolts
Toolbox of models, callbacks, and datasets for AI/ML researchers.
StanfordVL/taskonomy
Taskonomy: Disentangling Task Transfer Learning [Best Paper, CVPR2018]
Lightning-AI/pytorch-lightning
Pretrain, finetune, and deploy AI models on multiple GPUs and TPUs with zero code changes.
krasserm/perceiver-io
A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with PyTorch Lightning scripts for distributed training