Pinned Repositories
Video-MME
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
3DRefTR
This is a PyTorch implementation of 3DRefTR proposed by our paper "A Unified Framework for 3D Point Cloud Visual Grounding"
PointMetaBase
This is a PyTorch implementation of PointMetaBase proposed by our paper "Meta Architecure for Point Cloud Analysis"
Video-MME
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
LLaVA-NeXT
MinkowskiEngine
Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors
nccl
Optimized primitives for collective multi-GPU communication
spconv
Spatial Sparse Convolution Library
video-mme.github.io
Leon1207's Repositories
Leon1207/3DRefTR
This is a PyTorch implementation of 3DRefTR proposed by our paper "A Unified Framework for 3D Point Cloud Visual Grounding"
Leon1207/PointMetaBase
This is a PyTorch implementation of PointMetaBase proposed by our paper "Meta Architecure for Point Cloud Analysis"
Leon1207/Video-MME
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis