Hexagons dataset: Processing and grounding abstraction in natural language

Primary language: HTML. License: MIT.

Hexagons

This repository contains the data used to train and evaluate the instruction-to-execution task derived from the Hexagons dataset. The Hexagons dataset comprises 4177 naturally-occurring, visually grounded instructions rich with diverse types and levels of abstraction.
For more details about the framework of the Hexagons dataset, the Hexagons App & Game, the abstraction elicitation methodology, the dataset and the baseline models, please refer to our paper and website.

Dataset

Download

Dataset Description

For details about the data, please refer to the dataset section on our website and to our paper.

Data Format

The dataset is split into three files (train, dev and test) with an 80/10/10 ratio of the drawing procedures (for more details on the split rationale, refer to our paper). Each file is in JSONL format, where each entry is a dictionary describing a single drawing procedure. A drawing procedure consists of:

  • index: a unique identifier of a drawing procedure, given as a number between 0 and 619. The indices are distributed across train/dev/test in random order. The index matches the one used in the dataset visualization, so a drawing procedure can be looked up and visualized by its index.

  • annotation_round: a unique identifier for each round where we collected drawing procedures. The values are:

    • '1': The first round where we used images designed by the project team.
    • '2': The second round where we used images designed by experienced crowdworkers.
    • '0': The preliminary rounds (results from the Pilot and Recruitment phases).

    The split of the drawing procedure indices by round is as follows:

    • The first round consists of indices: 0-302, 492.
    • The second round consists of indices: 303-491, 493-496.
    • The preliminary rounds consist of indices: 497-619.
  • category: the abstraction mechanism that the target image of a drawing procedure is intended to trigger. The categories are: 'simple', 'bounded iteration', 'conditional iteration', 'conditions', 'recursion', 'symmetry', 'composed objects' and 'other'. The value 'NONE' marks drawing procedures from the second round, where no category is associated with the image.

  • image_id: a unique identifier of an image (task) that is used as the target image for which a drawing procedure is written. The image identifier consists of 9-10 characters made up of numbers and capital letters (e.g., 'P01C02T03').

  • instructor_id: a unique identifier of an Instructor marked as a number between 1 and 38. 24 Instructors participated in the first and second rounds, and 14 more in the preliminary rounds.

  • number_of_drawing_steps: the number of drawing steps comprising the drawing procedure (i.e., the length of the drawing procedure).

  • agreement_tags: a list of tags, of the same length as the drawing procedure, that indicates the level of agreement between the Instructor and the two Verifiers for each drawing step (find details in our paper).
    The level of agreement is calculated by the Board-Based Exact Match metric, that is, two executions of the instructions agree if and only if their denotations (i.e., the resulting board states on the Hexagons board) are identical.
    The tags are arranged in the same order as the drawing steps in the procedure, such that each tag corresponds to a single drawing step. The tags can take one of the following values:

    • 'A': Both Verifiers agree with the Instructor.
    • 'V1': Only Verifier 1 agrees with the Instructor*.
    • 'V2': Only Verifier 2 agrees with the Instructor*.
    • 'VV': Verifiers agree with each other, but not with the Instructor.
    • 'F': Verifiers do not agree with each other, nor with the Instructor.

    *The index next to the Verifier carries no meaning of its own. 'V1' or 'V2' just means that only one Verifier agrees with the Instructor, and the index helps us track which one.
    For further details on the agreement and metrics, please refer to the dataset section on our website and to our paper.

  • agreement_scores: a list of tuples, analogous to agreement_tags, but with the agreement calculated by the Board-Based F1 metric.
    A single tuple $(s_1, s_2, s_3)$ corresponds to a single drawing step, where:

    • $s_1$ is the F1 score between the Instructor's and Verifier 1's executions.
    • $s_2$ is the F1 score between the Instructor's and Verifier 2's executions.
    • $s_3$ is the F1 score between the executions of both Verifiers.

    For further details on the agreement and metrics, please refer to the dataset section on our website and to our paper.

  • drawing_procedure: a list of lists representing a drawing procedure. Each sub-list [id, instructions, board state] stands for a single drawing step and consists of:

    • id: a number indicating the drawing step position in the procedure*.
    • instructions: the instructions written by the Instructor in that step.
    • board state: a list of 180 digits indicating the resulting board state after the Instructor executes the instructions.
      The board has 10 rows x 18 columns (180 tiles), numbered left-to-right, top-to-bottom.

      The digits run from 0 to 7 and stand for colors as follows:
      • '0': white
      • '1': black
      • '2': yellow
      • '3': green
      • '4': red
      • '5': blue
      • '6': purple
      • '7': orange

    *The drawing steps are ordered from the first step to the last, starting with 0. The 0-step stands for the initial board state (i.e., a blank board) prior to the first set of instructions, and hence carries no instructions (its instructions value is "NONE").
    Note that the length of a drawing procedure (i.e., the number of drawing steps) does not count the 0-step (e.g., a drawing procedure of length 6 consists of 6 drawing steps numbered from 1 to 6, in addition to the 0-step).
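
The format described above can be read with the Python standard library alone. The following is a minimal sketch, not code from this repository: the helper names (parse_procedures, board_to_grid, painted_tiles) are hypothetical, the record is synthetic, and the sketch assumes that the 180 digits are stored in row-major order (left-to-right within a row, rows top-to-bottom) and that rows and columns are counted from 0.

```python
import json

ROWS, COLS = 10, 18  # board dimensions stated in the format description
# Color names indexed by digit, as listed in the board state description.
COLORS = ["white", "black", "yellow", "green", "red", "blue", "purple", "orange"]


def parse_procedures(lines):
    """Parse an iterable of JSONL lines into drawing-procedure dictionaries."""
    return [json.loads(line) for line in lines if line.strip()]


def board_to_grid(board_state):
    """Reshape a flat board state (180 digits) into 10 rows of 18 color codes.

    Assumption: row-major order, left-to-right within a row, rows top-to-bottom.
    """
    if len(board_state) != ROWS * COLS:
        raise ValueError(f"expected {ROWS * COLS} digits, got {len(board_state)}")
    cells = [int(d) for d in board_state]  # accepts ints or digit characters
    return [cells[r * COLS:(r + 1) * COLS] for r in range(ROWS)]


def painted_tiles(board_state):
    """List (row, column, color name) for every non-white tile, 0-indexed."""
    grid = board_to_grid(board_state)
    return [(r, c, COLORS[code])
            for r, row in enumerate(grid)
            for c, code in enumerate(row)
            if code != 0]


# Synthetic record for illustration only; it is not taken from the dataset.
blank = [0] * (ROWS * COLS)
after_step_1 = list(blank)
after_step_1[0] = 4    # first tile of the top row painted red
after_step_1[19] = 5   # second tile of the second row painted blue
record = {
    "index": 0,
    "number_of_drawing_steps": 1,
    "drawing_procedure": [
        [0, "NONE", blank],  # the 0-step: blank board, no instructions
        [1, "hypothetical instruction text", after_step_1],
    ],
}

procedures = parse_procedures([json.dumps(record)])
for step_id, instructions, board_state in procedures[0]["drawing_procedure"]:
    print(step_id, instructions, painted_tiles(board_state))
```

For a real split, an open file handle can be passed directly to parse_procedures, since iterating over a text file yields one JSONL entry per line.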

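The agreement_tags values listed above follow mechanically from comparing three board states under exact match. The sketch below illustrates that mapping; it is an assumption-labeled illustration rather than the authors' code. The function name agreement_tag is hypothetical, and the Verifiers' board states used here are synthetic, since the dataset entries described above store only the resulting tags and scores together with the Instructor's board states.

```python
def agreement_tag(instructor, verifier1, verifier2):
    """Return the agreement tag for one drawing step.

    Each argument is a board state (a list of 180 digits). Two executions
    agree if and only if their board states are identical (exact match).
    """
    agrees_v1 = instructor == verifier1
    agrees_v2 = instructor == verifier2
    if agrees_v1 and agrees_v2:
        return "A"   # both Verifiers agree with the Instructor
    if agrees_v1:
        return "V1"  # only Verifier 1 agrees with the Instructor
    if agrees_v2:
        return "V2"  # only Verifier 2 agrees with the Instructor
    if verifier1 == verifier2:
        return "VV"  # Verifiers agree with each other, not with the Instructor
    return "F"       # no two executions agree


# Synthetic board states for illustration only.
blank = [0] * 180
red_first = [4] + [0] * 179
blue_first = [5] + [0] * 179

print(agreement_tag(red_first, red_first, red_first))   # A
print(agreement_tag(red_first, red_first, blue_first))  # V1
print(agreement_tag(red_first, blue_first, blue_first)) # VV
print(agreement_tag(red_first, blue_first, blank))      # F
```
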
Citation

@article{hexagons,
  title={Draw me a Flower: {P}rocessing and Grounding Abstraction in Natural Language},
  author={Lachmy, Royi and Pyatkin, Valentina and Manevich, Avshalom and Tsarfaty, Reut},
  year={2022},
  journal={Accepted to Transactions of ACL}
}

Terms of Use

  • By downloading the data on this page the user acknowledges that their use will be restricted to research and/or academic purposes only.
  • Resources on this page are licensed CC-BY 4.0, a Creative Commons license requiring Attribution (https://creativecommons.org/licenses/by/4.0/).

Changelog

  • 25/08/2022 The Hexagons dataset is released: TACL paper + dataset + website.
  • 27/06/2021 First arXiv version of the paper is released.