AILab-CVC/SEED-Bench
(CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
Language: Python · License: NOASSERTION
Issues
Wrong Question Types in SEED-Bench-1
#28 opened by littlepenguin89106 - 1
Question about evaluation input format
#27 opened by yellow-binary-tree - 1
Question on multi-image input
#24 opened by auhowielau - 0
[bug] LLaVA evaluation: RuntimeError: Expected all tensors to be on the same device
#21 opened by JJJYmmm - 0
Multi-GPU Evaluation
#20 opened by JJJYmmm - 2
Question on how task27 generates images
#19 opened by JunZhan2000 - 2
Request for the interfaces of MiniGPT-4 and LLaVA
#10 opened by Richar-Du - 6
What is the correct way to download the video
#17 opened by teasgen - 1
Easy way to probe result examples?
#11 opened by chancharikmitra - 5
A lot of data in SEED-Bench-2 level L2 has more questions than pictures; is this reasonable?
#15 opened by nemonameless - 1
VLMs vs LLMs evaluation
#12 opened by idan-tankel - 2
Request for removing duplicate results
#16 opened by khanrc - 2
Reproducing the Qwen-VL SOTA results
#9 opened by jinze1994 - 1
In-Context Example Selection Process
#14 opened by mustafaadogan - 0
How to download the images?
#13 opened by dyahadila - 2
Support for evaluation of other VLMs like MiniGPT-4, mPLUG-Owl, LLaVA, and VPGTrans
#8 opened by WesleyHsieh0806 - 2
[Data] Could you provide a list of the Something-Something V2 files that should be downloaded?
#6 opened by aopolin-lv - 3
Evaluating latest version of OpenFlamingo
#2 opened by anas-awadalla - 1
Update for Otter-Image-MPT7B and Otter-Video
#1 opened by Luodian - 1
Update for mPLUG-Owl
#3 opened by MAGAer13 - 1