/kg-reconstruction-eval

Resources for evaluating the re-construction of RDF Reification

Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

(Re)Construction Impact on Metadata Representation Models

We investigate two alternatives for re-constructing an existing graph to interchange between different metadata representation models. This work is useful when the metadata representation of a pre-existing KG needs to change and the KG engineers responsible for its construction want to explore alternatives. We evaluate KG re-construction across four representations using (i) KG construction systems, which construct the KG from heterogeneous data with declarative mappings; and (ii) CONSTRUCT queries over a KG stored in a triplestore.
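As a rough illustration of the second alternative, the sketch below rewrites Standard Reification triples into an N-Ary Relationships pattern. It is not taken from this repository: the query text, the function name `reification_to_nary`, the use of `rdf:value` for the intermediate node, and the `http://example.org/` IRIs are all assumptions made for the example. The pure-Python function mirrors what the CONSTRUCT query would do, assuming each statement node has exactly one subject, predicate and object.

```python
# Hypothetical sketch: names and IRIs are illustrative, not the repository's.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
REIFICATION_TERMS = {RDF + "type", RDF + "subject", RDF + "predicate", RDF + "object"}

# Illustrative SPARQL 1.1 query expressing the same rewrite declaratively.
CONSTRUCT_QUERY = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
CONSTRUCT {
  ?s ?p ?stmt .
  ?stmt rdf:value ?o .
  ?stmt ?mp ?mo .
}
WHERE {
  ?stmt a rdf:Statement ;
        rdf:subject ?s ;
        rdf:predicate ?p ;
        rdf:object ?o .
  OPTIONAL {
    ?stmt ?mp ?mo .
    FILTER(?mp NOT IN (rdf:type, rdf:subject, rdf:predicate, rdf:object))
  }
}
"""


def reification_to_nary(triples):
    """Rewrite (s, p, o) tuples from Standard Reification to an N-ary pattern."""
    # Group predicate/object pairs by subject so each statement node is seen once.
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((p, o))

    out = set()
    for node, props in by_subject.items():
        if (RDF + "type", RDF + "Statement") not in props:
            continue  # not a reified statement
        slots = {p: o for p, o in props if p in REIFICATION_TERMS}
        s = slots.get(RDF + "subject")
        p = slots.get(RDF + "predicate")
        o = slots.get(RDF + "object")
        if s is None or p is None or o is None:
            continue  # incomplete reification, skipped
        # The statement node becomes the intermediate node of the N-ary relation.
        out.add((s, p, node))
        out.add((node, RDF + "value", o))
        # Metadata triples stay attached to the same node.
        for mp, mo in props:
            if mp not in REIFICATION_TERMS:
                out.add((node, mp, mo))
    return out


EX = "http://example.org/"
example = [
    (EX + "st1", RDF + "type", RDF + "Statement"),
    (EX + "st1", RDF + "subject", EX + "aspirin"),
    (EX + "st1", RDF + "predicate", EX + "treats"),
    (EX + "st1", RDF + "object", EX + "headache"),
    (EX + "st1", EX + "source", EX + "pmid123"),
]
for triple in sorted(reification_to_nary(example)):
    print(triple)
```

In a triplestore setting, only the query string would be needed; the Python function is there to make the rewrite inspectable without an RDF library.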

[Figure: workflow]

Engines

We test the performance and scalability of a set of KG construction engines and triplestores:

KG Construction Engines:

Triplestores:

Evaluation resources: SemMedDB

Dataset

SemMedDB, the Semantic MEDLINE Database, is a repository of biomedical entities and predications (subject-predicate-object triples) extracted from biomedical texts (titles and abstracts of PubMed citations).

The tables that comprise SemMedDB are available for download as a relational database or as CSV files. The data in this use case is licensed under the UMLS - Metathesaurus License Agreement, which does not allow its redistribution (the data may be accessed by obtaining an account with the UMLS license here).

We perform the evaluation with this dataset structured in four metadata representations (Standard Reification, Named Graphs, N-Ary Relationships and RDF-star) and in four size scales (1K, 10K, 100K and 1M).
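To make the four representations concrete, the following sketch serializes one annotated statement in each of them. It is only an illustration: the function name `represent`, the `ex:` terms, the default node and graph names, and the use of `rdf:value` in the N-ary pattern are assumptions, not the vocabulary or modelling choices used in this repository. Prefix declarations are omitted for brevity, and the RDF-star output uses the `<< ... >>` syntax of the RDF-star Community Group report.

```python
# Hypothetical sketch: all terms (ex:aspirin, ex:source, ...) are made up.
def represent(s, p, o, meta_p, meta_o, node="ex:stmt1", graph="ex:g1"):
    """Return one annotated statement in four metadata representations."""
    return {
        # Standard Reification (Turtle): a node describes the statement.
        "standard_reification": (
            f"{node} a rdf:Statement ;\n"
            f"    rdf:subject {s} ;\n"
            f"    rdf:predicate {p} ;\n"
            f"    rdf:object {o} ;\n"
            f"    {meta_p} {meta_o} ."
        ),
        # Named Graphs (TriG): the graph name carries the metadata.
        "named_graphs": (
            f"{graph} {{ {s} {p} {o} . }}\n"
            f"{graph} {meta_p} {meta_o} ."
        ),
        # N-Ary Relationships (Turtle): an intermediate node holds value and metadata.
        "nary_relationships": (
            f"{s} {p} {node} .\n"
            f"{node} rdf:value {o} ;\n"
            f"    {meta_p} {meta_o} ."
        ),
        # RDF-star (Turtle-star): the quoted triple is the subject of the metadata.
        "rdf_star": f"<< {s} {p} {o} >> {meta_p} {meta_o} .",
    }


forms = represent("ex:aspirin", "ex:treats", "ex:headache", "ex:source", "ex:pmid123")
for name, text in forms.items():
    print(f"# {name}\n{text}\n")
```

All four outputs encode the same statement and the same piece of metadata; what differs is where the metadata attaches, which is what the re-construction step has to translate.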

Mappings and Queries

All results (including fine-grained ones) can be found here.

Authors

  • Ana Iglesias-Molina (Ontology Engineering Group - UPM)
  • David Chaves-Fraga (Ontology Engineering Group - UPM)
  • Jhon Toledo (Ontology Engineering Group - UPM)