Awesome KGC Tools

Links to and descriptions of Knowledge Graph Construction (KGC) tools

KGC Materializers

  • Morph-KGC - R2RML and RML processor to generate RDF knowledge graphs from heterogeneous data sources at scale.
  • Chimera - Framework based on Apache Camel to define composable semantic data transformation pipelines (lifting/lowering to/from RDF).
  • RMLMapper - Executes RML rules to generate high-quality Linked Data from multiple originally (semi-)structured data sources.
  • RMLStreamer - Executes RML rules to generate high-quality Linked Data from multiple originally (semi-)structured data sources in a streaming way.
  • xls2rdf - Converts Excel files containing a "magic line" into RDF.
  • Morph-xR2RML - Implementation of the xR2RML mapping language (extending R2RML and reusing RML terms) for MongoDB databases. It can map JSON data as well as any format that can be imported into MongoDB, in particular CSV/TSV. It has been used in several projects to produce 2.4 billion triples so far.
  • SDM-RDFizer - An efficient, scaled-up RML-compliant engine for knowledge graph construction from heterogeneous data sources.
  • CARML - An extensible RML processor to generate RDF knowledge graphs from heterogeneous data sources.
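
To make the term concrete: a materializer applies mapping rules to source records and writes out the resulting RDF triples. The sketch below illustrates that idea in plain Python with the standard library only. It does not use any of the tools listed above, and it is far simpler than real RML; the `MAPPING` dictionary, the `materialize` function, and the `http://example.org/` IRIs are hypothetical names chosen for illustration.

```python
import csv
import io
import re
from urllib.parse import quote

# Hypothetical, simplified mapping: a subject IRI template plus
# predicate -> column pairs. Real RML mappings are RDF documents.
MAPPING = {
    "subject": "http://example.org/person/{id}",
    "predicates": {
        "http://xmlns.com/foaf/0.1/name": "name",
        "http://xmlns.com/foaf/0.1/age": "age",
    },
}


def escape_literal(value):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def materialize(csv_text, mapping):
    """Yield one N-Triples line per non-empty mapped cell."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Fill {column} placeholders in the subject template,
        # percent-encoding each value so the IRI stays valid.
        subject = re.sub(
            r"\{(\w+)\}",
            lambda m: quote(row[m.group(1)], safe=""),
            mapping["subject"],
        )
        for predicate, column in mapping["predicates"].items():
            value = row.get(column)
            if value:  # this sketch skips empty cells
                yield f'<{subject}> <{predicate}> "{escape_literal(value)}" .'


if __name__ == "__main__":
    data = "id,name,age\n1,Alice,30\n2,Bob,\n"
    for triple in materialize(data, MAPPING):
        print(triple)
```

All objects are emitted as plain string literals here; the engines listed above additionally handle datatypes, language tags, joins, and non-tabular sources.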

KGC Virtualizers

  • Ontop - A platform to query relational databases as virtual RDF graphs using SPARQL, with mappings expressed in R2RML.
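
In contrast to materialization, a virtualizer leaves the data in the source database and rewrites each query into SQL when it is asked. The following toy sketch shows only that principle, using an in-memory SQLite database. It is not Ontop's API and it handles a single triple pattern rather than SPARQL; `MAPPING`, `SUBJECT_TEMPLATE`, `query_predicate`, and the `person` table are hypothetical.

```python
import sqlite3

# Hypothetical mapping from predicate IRIs to (table, key column, value column).
MAPPING = {
    "http://xmlns.com/foaf/0.1/name": ("person", "id", "name"),
}
SUBJECT_TEMPLATE = "http://example.org/person/{}"


def query_predicate(conn, predicate):
    """Answer the pattern `?s <predicate> ?o` by rewriting it to SQL at query time."""
    table, key_col, value_col = MAPPING[predicate]
    # Identifiers come from the trusted mapping above, not from user input.
    sql = f"SELECT {key_col}, {value_col} FROM {table} WHERE {value_col} IS NOT NULL"
    for key, value in conn.execute(sql):
        # Triples are produced on the fly; nothing is stored as RDF.
        yield (SUBJECT_TEMPLATE.format(key), predicate, value)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Alice"), (2, None)])
    for triple in query_predicate(conn, "http://xmlns.com/foaf/0.1/name"):
        print(triple)
```

Because the triples are computed per query, changes in the database are visible immediately, at the cost of doing the translation work on every request.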

KGC Pre-processors

  • MEL - (Metadata Extractor & Loader) - A tool to extract metadata (and textual content) from various file formats, as JSON objects.
  • Dragoman - An efficient RML+FnO-compliant engine that translates and executes complex functions in RML mapping rules and transforms the data integration system into a function-free one.
  • EABlock - A computational block to solve entity alignment over textual attributes in a knowledge graph creation pipeline.
  • FunMap - Efficient preprocessing of transformation rules described in RML+FnO mappings.
  • Excel in RML - RMLMapper extension to support Excel spreadsheets.

NLP for KGC

  • TNNT - (The NLP/NER Toolkit) - A tool that automates the extraction of categorised named entities from the unstructured information encoded in the source documents, using diverse NLP tools and NER models.

Mapping Specifications

  • RML - The RDF Mapping Language (RML) is a mapping language for expressing customized mapping rules from heterogeneous data structures and serializations to the RDF data model.
  • Target in RML - Alignment between RML and Target to describe how your knowledge graph should be exported to one or multiple targets.
  • DataIO - Target, a formal model and a common representation for specifying how a knowledge graph should be exported to a given target.
  • FnO - Function Ontology (FnO), a way to semantically declare and describe implementation-independent functions, and their relations to related concepts such as parameters, outputs, related problems, algorithms, mappings to concrete implementations, and executions.
  • YARRRML - YARRRML is a human-readable, text-based representation for declarative generation rules.
  • J2RM - J2RM mappings and their engine form a tool that processes mappings from JSON data to RDF triples, guided by an OWL2 ontology structure.
  • xls2rdf - The documentation for the "magic line" of the xls2rdf converter.
  • xR2RML - xR2RML is a language for expressing customized mappings from various types of databases (XML, object-oriented, NoSQL) to RDF datasets.

Mapping Editors

  • JUMA - Jigsaw Puzzles for Representing Mappings
  • Mapeathor - Definition of Excel-based mappings and translation to [R2]RML mappings.
  • Matey - Matey is a web-based editor for YARRRML rules.
  • RMLEditor - RMLEditor offers a Graphical User Interface to enable data publishers, who are domain experts, to model knowledge derived from heterogeneous distributed data.
  • RMLx Visual Editor - A web-based editor for RML rules.
  • Square - SPARQL Queries and R2RML mappings Environment
  • Map-On - A web-based editor for visual ontology mapping for R2RML documents (DEPRECATED)

Mapping Generators

  • Spread2RML - Suggests RML mappings on messy spreadsheets.

KGC Pipelines

  • KGCP - "KG Construction Pipeline" - A suite of software artifacts to automate the creation of KGs from heterogeneous data sources.

KGC Evaluation

  • Data Sprout - Excel spreadsheet generator for evaluating KG construction.
  • GTFS-Madrid-Bench - Benchmark to evaluate the performance and scalability of declarative KG construction engines.