royzhang12

Pinned Repositories

Video-MME
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
407 5 3212
LLaVA-NeXT
Language: Python
2.9k 34 304250
VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
Language: Python
1.4k 10 209194
InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
Language: Python
6.1k 53 623474
HieraSeg
CVPR2022 - Deep Hierarchical Semantic Segmentation - A structured, pixel-wise description of visual scenes in terms of the class hierarchy.
Language: Python
1 1 00
Target-referenced-Reactive-Grasping-for-Dynamic-Objects
Language: Python
9 0 00

royzhang12's Repositories

royzhang12/Target-referenced-Reactive-Grasping-for-Dynamic-Objects
Language: Python
9 0 00
royzhang12/HieraSeg
CVPR2022 - Deep Hierarchical Semantic Segmentation - A structured, pixel-wise description of visual scenes in terms of the class hierarchy.
Language: Python
1 1 00