
WebNLG Text-to-triples

The evaluation script for the Text-to-triples task for WebNLG. This script links the candidate triples to the reference triples (choosing the alignment that gives the highest average F1 score) and calculates Precision, Recall, and F1 based on the metrics used for the SemEval 2013 task (see also this page for an explanation of the scoring types). Additionally, Precision, Recall, and F1 are calculated for the full triple (based on Liu et al., 2018).
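As a rough illustration of the full-triple metric (a sketch, not the evaluation script itself; the function name and tuple format are assumptions), exact full-triple matching can be scored like this:

```python
def full_triple_scores(candidates, references):
    """Micro-averaged Precision/Recall/F1 over exact full-triple matches.

    Each triple is a (subject, predicate, object) tuple; only triples
    that match a reference exactly count as true positives.
    """
    cand = set(candidates)
    ref = set(references)
    tp = len(cand & ref)  # exact matches
    precision = tp / len(cand) if cand else 0.0
    recall = tp / len(ref) if ref else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

p, r, f = full_triple_scores(
    [("Aarhus", "leaderName", "Jacob_Bundsgaard")],
    [("Aarhus", "leaderName", "Jacob_Bundsgaard"),
     ("Aarhus", "country", "Denmark")],
)
# precision 1.0, recall 0.5
```

The actual script additionally scores partial and type-level matches per the SemEval 2013 scheme, which this sketch does not cover.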

The script also tries to link every candidate attribute as well as possible to the reference attribute. Variations of the reference attribute are interpreted as a longer string (if there are no other non-matching words before or after the matched reference) or as a separate guess (if there are).
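The distinction above can be sketched as follows (a simplified illustration of the idea, not the script's exact matching logic; the function name is an assumption): a candidate counts as a "longer string" variation when it contains the reference tokens as one contiguous run, and as a separate guess otherwise.

```python
def classify_candidate(candidate, reference):
    """Simplified sketch: 'longer string' if the reference tokens occur
    as a contiguous run inside the candidate, else 'separate guess'."""
    cand = candidate.split()
    ref = reference.split()
    for i in range(len(cand) - len(ref) + 1):
        if cand[i:i + len(ref)] == ref:
            return "longer string"
    return "separate guess"

classify_candidate("the leader name", "leader name")  # → "longer string"
classify_candidate("leader", "leader name")           # → "separate guess"
```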

The candidates XML should be formatted as follows:

<benchmark>
  <entries>
    <entry category="Airport" eid="Id19">
      <generatedtripleset>
        <gtriple>Aarhus | leaderName | Jacob_Bundsgaard</gtriple>
      </generatedtripleset>
    </entry>
    <entry category="Airport" eid="Id18">
      <generatedtripleset>
        <gtriple>Antwerp_International_Airport | operatedBy | Government_of_Flanders</gtriple>
        <gtriple>Antwerp_International_Airport | cityServed | Antwerp</gtriple>
      </generatedtripleset>
    </entry>
  </entries>
</benchmark>
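A file in this format can be read with Python's standard library; for example (a minimal sketch, assuming the layout shown above; the function name is hypothetical):

```python
import xml.etree.ElementTree as ET

def load_candidate_triples(path):
    """Parse a candidates XML file into {entry eid: [(s, p, o), ...]}."""
    tree = ET.parse(path)
    entries = {}
    for entry in tree.getroot().iter("entry"):
        triples = []
        for gtriple in entry.iter("gtriple"):
            # Triples are stored as "subject | predicate | object"
            s, p, o = (part.strip() for part in gtriple.text.split(" | "))
            triples.append((s, p, o))
        entries[entry.get("eid")] = triples
    return entries
```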

To run this script, you need a number of third-party libraries; they can be installed by running pip3 install -r requirements.txt

The command to use the script is: python3 Evaluation_script.py <reference xml> <candidates xml>

And to save the results as a json file: python3 Evaluation_script_json.py <reference xml> <candidates xml> <results json>