
Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth-value of the sentence.


Natural Language for Visual Reasoning

This repository contains data for NLVR (Suhr et al. 2017) and NLVR2 (Suhr and Zhou et al. 2018).

The Natural Language for Visual Reasoning corpora pose the task of determining whether a sentence is true about a visual input, such as an image. The task focuses on reasoning about sets of objects, comparisons, and spatial relations. The repository includes two datasets: NLVR, which uses synthetically generated images, and NLVR2, which uses natural photographs.
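As a rough illustration of how one might iterate over examples, below is a minimal Python sketch that reads a JSON-lines file of sentence/label pairs. The file path and the field names ("identifier", "sentence", "label") are assumptions for illustration; consult the data files in this repository for the exact schema used by NLVR and NLVR2.

```python
import json

def load_examples(path):
    """Read NLVR-style examples from a JSON-lines file (one JSON object per line)."""
    examples = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            examples.append(json.loads(line))
    return examples

if __name__ == "__main__":
    # Hypothetical path; the actual file layout may differ between NLVR and NLVR2.
    examples = load_examples("nlvr2/data/train.json")
    first = examples[0]
    # Each example pairs a sentence with a visual input and a truth-value label.
    print(first.get("identifier"), first.get("sentence"), first.get("label"))
```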

Examples and leaderboards are available on the project webpage: http://lic.nlp.cornell.edu/nlvr/

If you have questions, please use the Issues page, or email us directly: nlvr@googlegroups.com

Licensing

NLVR (original dataset with synthetically generated images; Suhr et al. 2017)

Following Microsoft COCO (http://cocodataset.org/#termsofuse), we have licensed the NLVR dataset (synthetically-generated images, structured representations, and annotations) under CC-BY-4.0 (https://creativecommons.org/licenses/by/4.0/).

NLVR2 (dataset with real images; Suhr and Zhou et al. 2018)

We have licensed the annotations of the NLVR2 images (sentences and binary labels) under CC-BY-4.0 (https://creativecommons.org/licenses/by/4.0/). We do not license the NLVR2 images as we do not hold the copyright to them.