Pinned Repositories
t2v_metrics_cogvlm
Evaluating text-to-image/video/3D models with VQAScore
4D-Facial-Avatars
Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction
LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
t2v_metrics
Evaluating text-to-image/video/3D models with VQAScore
Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
One2345plus
CogVLM
A state-of-the-art open visual language model | multimodal pretrained model
aldoz-mila's Repositories
aldoz-mila/t2v_metrics_cogvlm
Evaluating text-to-image/video/3D models with VQAScore