EventKG-Click

A public cross-lingual dataset for training and evaluating novel models for event-centric cross-lingual user interaction.


About Dataset

EventKG+Click is a novel cross-lingual dataset that reflects the language-specific relevance of events and their relations. It aims to provide a reference source for training and evaluating novel models for event-centric cross-lingual user interaction, with a particular focus on models supported by knowledge graphs. The EventKG+Click dataset is based on two data sources:

  1. the Wikipedia clickstream that reflects real-world user interactions with events and their relations within language-specific Wikipedia editions; and
  2. the EventKG knowledge graph that contains semantic information regarding events and their relations that partially originates from Wikipedia.

EventKG+Click consists of two subsets:

  1. EventKG+Click_event, which contains relevance scores, location-closeness, recency, and Wikipedia link count factors for more than 4 thousand events; and

  2. EventKG+Click_relation, which contains nearly 10 thousand event-centric click-through pairs together with their language-specific number of clicks, their relation relevance, and the co-mentions of the relation, i.e., the number of sentences across the Wikipedia language editions that mention both the source and the target.
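
To illustrate the co-mention count described above, here is a minimal Python sketch. It is not the code used to build the dataset: the function name `count_co_mentions`, the example sentences, and the case-insensitive substring matching are all assumptions made for illustration. The actual pipeline may rely on Wikipedia links or entity annotations instead.

```python
def count_co_mentions(sentences, source, target):
    """Count the sentences that mention both the source and the target.

    Assumption: a "mention" is a case-insensitive substring match.
    """
    source_l = source.lower()
    target_l = target.lower()
    return sum(
        1
        for sentence in sentences
        if source_l in sentence.lower() and target_l in sentence.lower()
    )


# Hypothetical sentences standing in for one Wikipedia language edition.
sentences = [
    "World War II began after the invasion of Poland.",
    "The invasion of Poland started in September 1939.",
    "World War II ended in 1945.",
]

co_mentions = count_co_mentions(sentences, "World War II", "invasion of Poland")
print(co_mentions)
```

Only the first sentence names both events, so this toy example yields a co-mention count of 1 for the pair.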

A complete step-by-step walkthrough of the EventKG+Click creation process is available here.


This work is licensed under a Creative Commons Attribution 4.0 International License.


If you find the EventKG+Click dataset useful for your research, please consider citing the following paper:

 @article{abdollahieventkg+,
 title={EventKG+ Click: A Dataset of Language-specific Event-centric User Interaction Traces},
 author={Abdollahi, Sara and Gottschalk, Simon and Demidova, Elena}
}