Pinned Repositories
TinyGPT-V
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
ClipCropping
dylanqyuan
VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs); supports GPT-4V, Gemini, QwenVLPlus, 50+ HF models, and 20+ benchmarks
MiniGPT-5
Official implementation of paper "MiniGPT-5: Interleaved Vision-and-Language Generation via Generative Vokens"
VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs); supports 160+ VLMs and 50+ benchmarks
Q-Instruct
[CVPR 2024] Low-level visual instruction tuning, with a 200K dataset and a model zoo of fine-tuned checkpoints.
LLaVAR
Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"
AesBench
An expert benchmark that comprehensively evaluates the aesthetic perception capabilities of MLLMs.
AesExpert
[ACMMM 2024] AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception
dylanqyuan's Repositories
dylanqyuan/ClipCropping
dylanqyuan/dylanqyuan
dylanqyuan/VLMEvalKit
Open-source evaluation toolkit for large vision-language models (LVLMs); supports GPT-4V, Gemini, QwenVLPlus, 50+ HF models, and 20+ benchmarks