An Examination of the Robustness of Reference-Free Image Captioning Evaluation Metrics

This repository contains code for the paper "An Examination of the Robustness of Reference-Free Image Captioning Evaluation Metrics".

Overview

Recently, reference-free metrics such as CLIPScore (Hessel et al., 2021) and UMIC (Lee et al., 2021) have been proposed for automatic evaluation of image captions, demonstrating a high correlation with human judgment. We provide insights into the strengths and limitations of reference-free metrics for image captioning evaluation, guiding future improvements in this area.

Contents

  • Dataset: Download the dataset from here. We also provide a file containing the scores of all baselines for each metric.

  • Code: The code for our study will be released soon.