This is the repository for the paper V-FLUTE: Visual Figurative Language Understanding with Textual Explanations.
The dataset is available on HuggingFace.
You can reproduce the fine-tuned models using the scripts in LLaVA/scripts/vflute and the hyperparameters reported in the paper.
Our best model is available on HuggingFace here: LLaVA-1.5-7b-eViL-VFLUTE-lora
See the eval folder for scripts to compute F1@ExplanationScore and run inference on the test set.
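For orientation, here is a minimal sketch of a threshold-gated F1, not the code in the eval folder. It assumes a prediction counts as correct only when its label matches the gold label and its explanation score meets the threshold, and that the result is macro-averaged over the gold labels. The function name `f1_at_threshold`, its arguments, and the example labels and scores are all hypothetical. How the explanation score itself is computed is left to the paper and the eval scripts.

```python
from typing import Optional, Sequence


def f1_at_threshold(
    gold: Sequence[str],
    pred: Sequence[str],
    expl_scores: Sequence[float],
    threshold: float,
    labels: Optional[Sequence[str]] = None,
) -> float:
    """Macro-F1 where low-scoring explanations invalidate the prediction.

    Assumption (not taken from the repository): a predicted label whose
    explanation score is below `threshold` is treated as a wrong prediction.
    """
    if not (len(gold) == len(pred) == len(expl_scores)):
        raise ValueError("gold, pred and expl_scores must have equal length")

    # Sentinel that never equals a real label, so gated predictions
    # count as false negatives for their gold class.
    rejected = object()
    gated = [p if s >= threshold else rejected for p, s in zip(pred, expl_scores)]

    label_set = sorted(set(gold)) if labels is None else list(labels)
    f1_per_label = []
    for label in label_set:
        tp = sum(1 for g, p in zip(gold, gated) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, gated) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, gated) if g == label and p != label)
        denom = 2 * tp + fp + fn
        f1_per_label.append(2 * tp / denom if denom else 0.0)

    return sum(f1_per_label) / len(f1_per_label) if f1_per_label else 0.0


if __name__ == "__main__":
    # Hypothetical toy data: every label is right, one explanation is weak.
    gold = ["entailment", "contradiction", "entailment", "contradiction"]
    pred = ["entailment", "contradiction", "entailment", "contradiction"]
    scores = [0.9, 0.8, 0.3, 0.7]
    for t in (0.0, 0.5):
        print(f"F1@{t}: {f1_at_threshold(gold, pred, scores, t):.4f}")
```

Sweeping the threshold from 0 upward shows how much of the label F1 survives once the explanations are also required to be good. The scripts in the eval folder remain the reference for the reported numbers.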
@misc{saakyan2024vflute,
title={V-FLUTE: Visual Figurative Language Understanding with Textual Explanations},
author={Arkadiy Saakyan and Shreyas Kulkarni and Tuhin Chakrabarty and Smaranda Muresan},
year={2024},
eprint={2405.01474},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
For questions or additional information, please open a GitHub issue or contact Arkadiy Saakyan.