EDUKG: a Heterogeneous Sustainable K-12 Educational Knowledge Graph

Primary Language: Python

EDUKG

EduKG is proposed and maintained by the Knowledge Engineering Group of Tsinghua University. It is a heterogeneous, sustainable K-12 educational knowledge graph with an interdisciplinary and fine-grained ontology. This repository consists of 38 well-constructed knowledge graphs (more than 252 million entities and 3.86 billion triplets) built under this ontology.

Our contributions are summarized as follows:

  1. An interdisciplinary, fine-grained ontology that uniformly represents K-12 educational knowledge, resources, and heterogeneous data with 635 classes, 445 object properties, and 1314 datatype properties;

  2. A large-scale, heterogeneous K-12 educational KG with more than 252 million entities and 3.86 billion triplets, built from massive educational and external resources;

  3. A flexible and sustainable construction and maintenance mechanism that lets EDUKG evolve dynamically: the guiding schema of the construction methodology is designed to be hot-swappable, and 32 different data sources are monitored simultaneously so that heterogeneous data can be infused incrementally.

Updates

May 13th:

  • The first version of our repository is officially online!
  • New heterogeneous databases parsed and uploaded: NBSC, BioGRID

Repository Framework

The framework of our work is illustrated below:

[Figure: EDUKG overall architecture]

This repository contains all the resources in EduKG as well as the toolkits developed to collect them. This ongoing project will be maintained with special care so that users can obtain fresh and timely resources for intelligent education.

Resources

The resources of EduKG consist mainly of three parts: knowledge topics, educational resources, and external heterogeneous data. All of them are interconnected by the fine-grained, unified ontology.

The knowledge topic resources

| Knowledge Topic Resources | Description | Download | Size | Last Update |
|---|---|---|---|---|
| Main Concepts | The main concepts involved in K-12 education | main.ttl | 16.8M | May 18th, 2022 |
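
Assuming the `.ttl` downloads are Turtle files, it can be useful to check which namespaces a file declares before loading it into an RDF library. The sketch below uses only the Python standard library. The function name `read_prefixes` is hypothetical, and the code assumes each prefix declaration sits on its own line, which Turtle does not strictly require. It is a quick inspection aid, not a Turtle parser.

```python
import re

# Matches "@prefix ex: <iri> ." as well as the SPARQL-style "PREFIX ex: <iri>".
# Assumption: every declaration starts on its own line.
PREFIX_RE = re.compile(r'^\s*(?:@prefix|(?i:prefix))\s+([^\s:]*):\s*<([^>]*)>')


def read_prefixes(lines):
    """Return {prefix: namespace IRI} for the prefix declarations found in `lines`.

    `lines` can be any iterable of strings, for example an open text file,
    so a large file is scanned line by line instead of being read at once.
    """
    prefixes = {}
    for line in lines:
        match = PREFIX_RE.match(line)
        if match:
            prefixes[match.group(1)] = match.group(2)
    return prefixes
```

With `main.ttl` downloaded to the working directory, opening it with `open("main.ttl", encoding="utf-8")` and passing the file object to `read_prefixes` would return the declared prefixes; which prefixes the file actually contains is not stated here.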

The educational resources

| Educational Resources | Description | Download | Size | Last Update |
|---|---|---|---|---|
| Exercises | The exercises and exams of Chinese K-12 education since 2017 | exercise.ttl | N/A | May 5th, 2022 |
| Reading Material | The supplementary and mandatory reading materials for Chinese K-12 education | material.ttl | 3.7M | May 5th, 2022 |

Note: Due to copyright issues, the textual content of the reading material will not be published.


The external heterogeneous resources

| External Heterogeneous Data | Description | Download | Size | Last Update |
|---|---|---|---|---|
| BioGRID | An online interaction repository with data compiled through comprehensive curation efforts | biogrid.zip | 52.8M | May 5th, 2022 |
| UniProtKB | The central hub for the collection of functional information on proteins | uniprot.zip | 410.8M | May 5th, 2022 |
| New York Times | An American daily newspaper based in New York City with a worldwide readership | nytimes.zip | 544.6M | May 5th, 2022 |
| NBSC | Public statistical data provided by the National Bureau of Statistics of China | NBS.zip | 8.9M | May 5th, 2022 |
| HowNet | An online common-sense knowledge base unveiling inter-conceptual relations and inter-attribute relations of concepts | HowNet.zip | 8.2M | May 5th, 2022 |
| WordNet | A large lexical database of English | Xingyu | N/A | May 19th, 2022 |
| Framester | A frame-based ontological resource acting as a hub between linguistic resources | framester.zip | 969.7M | May 5th, 2022 |
| GeoNames | The GeoNames geographical database covers all countries and contains over eleven million placenames | geoNames.zip | 459.8M | May 5th, 2022 |
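
Several of these archives are large, and their internal layout is not described here, so it may be worth listing what an archive contains before extracting it. The following is a minimal sketch using Python's standard `zipfile` module. The helper name `list_archive` is hypothetical, and no particular file names inside the archives are assumed.

```python
import zipfile


def list_archive(path):
    """Return (member name, uncompressed size in bytes) for each file in a ZIP archive.

    Only the archive's central directory is read, so nothing is extracted to disk.
    Directory entries are skipped.
    """
    with zipfile.ZipFile(path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()
        ]
```

For instance, `list_archive("biogrid.zip")` would return the members of the BioGRID download, assuming the file has been saved to the working directory.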

Toolkits

The toolkits for constructing EduKG are provided below:

| Name | Description | Jump to |
|---|---|---|
| Knowledge Candidate Extraction | TBD | To be uploaded |
| Concept Expansion | TBD | To be uploaded |
| Entity Linking | TBD | To be uploaded |
| Entity Alignment | Aligns entities with Wikidata and XLORE entities using neighborhood information | edukgea |
| XML2TTL Parser | A modular tool for parsing XML into a knowledge graph with an ontology | xml2ttl |
| Rhetorical Role Typing | Rhetorical role typing based on the dependency tree | rhetyper |
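
To give a rough idea of what converting XML into Turtle involves, here is a small illustrative sketch. It is not the code of the `xml2ttl` tool, whose interface is not described here. The `<record id="...">` layout, the `ex:` namespace, the `tag_to_property` mapping, and both function names are assumptions made only for this example.

```python
import xml.etree.ElementTree as ET


def escape_literal(text):
    """Escape characters that cannot appear raw in a short Turtle string literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def xml_to_turtle(xml_text, tag_to_property, base="http://example.org/"):
    """Turn <record id="..."> elements into Turtle statements.

    `tag_to_property` maps child tag names to property names and stands in
    for an ontology mapping; child tags missing from it are ignored.
    Assumption: every `id` attribute is a valid Turtle local name.
    """
    root = ET.fromstring(xml_text)
    lines = [f"@prefix ex: <{base}> ."]
    for record in root.iter("record"):
        subject = f"ex:{record.get('id')}"
        for child in record:
            prop = tag_to_property.get(child.tag)
            if prop is None:
                continue  # tag not covered by the mapping
            value = escape_literal((child.text or "").strip())
            lines.append(f'{subject} ex:{prop} "{value}" .')
    return "\n".join(lines)
```

Keeping the tag-to-property mapping as a separate argument means the same traversal can be reused for different XML sources by swapping only the mapping.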

Reference