FuxiaoLiu
Hi! I'm a third-year CS Ph.D. student at the University of Maryland, College Park, working with Abhinav Shrivastava and Yaser Yacoob.
Pinned Repositories
awesome-Large-MultiModal-Hallucination
😎 An up-to-date, curated list of awesome LMM hallucination papers, methods & resources.
DocumentCLIP
[ICPRAI 2024] DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents
HallusionBench
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Large-Multimodal-Hallucination
LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
MMC
[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning
Twitter-Video-dataset
[EACL'23] COVID-VTS: Fact Extraction and Verification on Short Video Platforms
VisualNews-Repository
[EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning
EAGLE
Eagle Family: Exploring Model Designs, Data Recipes and Training Strategies for Frontier-Class Multimodal LLMs
FuxiaoLiu's Repositories
FuxiaoLiu/LRV-Instruction
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
FuxiaoLiu/MMC
[NAACL 2024] MMC: Advancing Multimodal Chart Understanding with LLM Instruction Tuning
FuxiaoLiu/VisualNews-Repository
[EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning
FuxiaoLiu/DocumentCLIP
[ICPRAI 2024] DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents
FuxiaoLiu/Twitter-Video-dataset
[EACL'23] COVID-VTS: Fact Extraction and Verification on Short Video Platforms
FuxiaoLiu/HallusionBench
[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
FuxiaoLiu/Large-Multimodal-Hallucination
FuxiaoLiu/awesome-Large-MultiModal-Hallucination
😎 An up-to-date, curated list of awesome LMM hallucination papers, methods & resources.
FuxiaoLiu/EAGLE
EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
FuxiaoLiu/Awesome-Multimodal-Large-Language-Models
✨✨ Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.
FuxiaoLiu/calvinliu123
FuxiaoLiu/calvinliu123.github.io
Github Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
FuxiaoLiu/Classproject_VIL
FuxiaoLiu/CMSC722_project
FuxiaoLiu/fuxiaoliu.github.io
FuxiaoLiu/GoodNews
Good News Everyone! - CVPR 2019
FuxiaoLiu/LLaVA
Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities.
FuxiaoLiu/LRV
FuxiaoLiu/M3Exam
Data and code for paper "M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models"
FuxiaoLiu/MiniGPT-4
MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models
FuxiaoLiu/mplug_implementation_fl
FuxiaoLiu/open_clip
An open source implementation of CLIP.
FuxiaoLiu/Recommendation-System
FuxiaoLiu/Role-Embedding
FuxiaoLiu/SAT
FuxiaoLiu/self-instruct
Aligning pretrained language models with instruction data generated by themselves.
FuxiaoLiu/TCP
[NeurIPS 2022] Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline.
FuxiaoLiu/tool4ipp
This repository contains a data conversion tool for the Image Position Prediction task proposed in our paper.
FuxiaoLiu/Twitter-COMMs
FuxiaoLiu/VILA
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)