Who are you referring to? Coreference resolution in image narrations


Arushi Goel, Basura Fernando, Frank Keller, Hakan Bilen, Who are you referring to? Coreference resolution in image narrations, ICCV 2023.

The webpage for this project is available here, with a link to the paper and CIN dataset.


CIN Dataset

  1. Download the CIN dataset from here

  2. The folder contains the following files:

  • testval_annotations.json: a JSON file with the following structure:
 "image": "image_id (flickr30k images)",
 "captions": "narration",
 "split": "test/val",
 "img_width": "int",
 "img_height": "int",
 "query": "list of phrases",
 "query_start_end": "list of start and end index for each phrase/query",
 "cluster": "list of cluster id for each phrase/query",
 "target_bboxes": "list of bounding boxes for each phrase/query (x,y,w,h)"
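
The fields above can be read with a few lines of Python. The following is a minimal sketch, not the authors' loader. It assumes that the file holds either a list of records or a dict of records, that "query", "cluster", and "target_bboxes" are parallel lists with one entry per phrase, and that each box is a single (x, y, w, h) list. The helper names (load_annotations, coreference_chains, xywh_to_xyxy) and the sample record are hypothetical and only illustrate the layout.

```python
import json
from collections import defaultdict


def load_annotations(path):
    """Load the annotation file and return a list of records.

    Assumption: the top level is either a list of records or a dict
    whose values are records; the README does not state which.
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return list(data.values()) if isinstance(data, dict) else data


def coreference_chains(record):
    """Group phrases (and their boxes) that share a cluster id."""
    chains = defaultdict(list)
    # Assumption: the three lists are aligned, one entry per phrase.
    for phrase, cluster_id, box in zip(
        record["query"], record["cluster"], record["target_bboxes"]
    ):
        chains[cluster_id].append((phrase, box))
    return dict(chains)


def xywh_to_xyxy(box):
    """Convert one (x, y, w, h) box to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


# Hypothetical record that mimics the documented fields (not real CIN data).
sample = {
    "image": "0000000000",
    "captions": "a woman is standing and she holds a cup",
    "split": "val",
    "img_width": 500,
    "img_height": 375,
    "query": ["a woman", "she", "a cup"],
    "query_start_end": [[0, 7], [24, 27], [34, 39]],
    "cluster": [0, 0, 1],
    "target_bboxes": [[10, 20, 50, 100], [10, 20, 50, 100], [40, 60, 15, 20]],
}

chains = coreference_chains(sample)
for cluster_id, mentions in chains.items():
    print(cluster_id, [phrase for phrase, _ in mentions])
```

Phrases that share a cluster id refer to the same entity, so grouping by "cluster" recovers the coreference chains of a narration. To read the real file, call load_annotations("testval_annotations.json") and pass each returned record to coreference_chains.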


For any queries or concerns, please contact the first author at goel.arushi@gmail.com.