Investigating contextual information and coreference relationships in complex text data

Primary language: Python

Insights-Knowledge-Graphs

Members: Jenny Chen (Stats '21), Ziwei Gu (CS/Math '21), Tushar Khan (CS '22), Kevin Ngo (CS '21), Oscar So (CS '22)

Introduction

Knowledge graphs are a means of storing and using data that lets people and machines better tap into the connections in their datasets. They capture entities, attributes, and relationships. Equipped with context, they have the potential to power insights into complex systems, with applications in recommendation systems, query expansion, and question answering. However, little existing work has used knowledge graphs to represent contextual information. We believe knowledge graphs will help AI systems reach their full potential in tasks ranging from information retrieval to machine reading comprehension. As a higher-level application, we will evaluate our model on the SQuAD 2.0 question-answering dataset.
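To make "entities, attributes, and relationships" concrete, here is a minimal sketch of one way such a graph could be held in memory. It is an illustration only, not this project's implementation: the `KnowledgeGraph` class, its method names, and the sample facts are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Hypothetical in-memory store, used only for illustration."""

    # entity name -> {attribute name: value}
    attributes: dict = field(default_factory=lambda: defaultdict(dict))
    # set of (subject, relation, object) triples
    triples: set = field(default_factory=set)

    def add_entity(self, name, **attrs):
        # Registers the entity even when no attributes are given.
        self.attributes[name].update(attrs)

    def add_relation(self, subject, relation, obj):
        # Both endpoints become entities if they are not already known.
        self.add_entity(subject)
        self.add_entity(obj)
        self.triples.add((subject, relation, obj))

    def neighbors(self, entity):
        """Return sorted (relation, object) pairs for edges leaving `entity`."""
        return sorted((r, o) for s, r, o in self.triples if s == entity)


# Hand-written sample facts (assumed for the example, not project data).
kg = KnowledgeGraph()
kg.add_entity("Marie Curie", type="Person")
kg.add_entity("Nobel Prize in Physics", type="Award")
kg.add_relation("Marie Curie", "won", "Nobel Prize in Physics")
kg.add_relation("Marie Curie", "born_in", "Warsaw")
```

Calling `kg.neighbors("Marie Curie")` then lists the outgoing relations of that entity, which is the kind of connection lookup the paragraph above refers to.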

Keywords: Information Retrieval, Named Entity Recognition, Relation Extraction, Coreference Knowledge, Data Visualization, Recurrent Neural Networks (RNN), Question Answering.

Objectives

  1. Constructing extractive and query-specific knowledge graphs from a collection of Wikipedia or scientific articles.
  2. Creating an interactive visualization that shows the relationships among entities and concepts.
  3. Incorporating knowledge graphs as an additional information source into RNN models for question-answering tasks.
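
As a rough illustration of what "query-specific" could mean in the first objective, the sketch below keeps only the triples whose entities lie within a few hops of an entity named in the question. Everything here is an assumption made for the example: the function name `query_subgraph`, the `max_hops` parameter, the naive substring matching (a real pipeline would more likely use named entity recognition and coreference resolution), and the hand-written toy triples.

```python
from collections import deque


def query_subgraph(triples, question, max_hops=1):
    """Return triples whose two entities are both within `max_hops`
    of some entity mentioned in `question` (hypothetical helper)."""
    entities = {s for s, _, _ in triples} | {o for _, _, o in triples}

    # Naive entity linking: case-insensitive substring match (assumption).
    lowered = question.lower()
    seeds = {e for e in entities if e.lower() in lowered}

    # Treat relations as undirected when measuring distance.
    adjacency = {}
    for s, _, o in triples:
        adjacency.setdefault(s, set()).add(o)
        adjacency.setdefault(o, set()).add(s)

    # Breadth-first search outward from the seed entities.
    depth = {e: 0 for e in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if depth[node] == max_hops:
            continue
        for nxt in adjacency.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)

    # Keep the subgraph induced by the reached entities, in input order.
    return [t for t in triples if t[0] in depth and t[2] in depth]


# Toy triples written by hand for this example.
toy_triples = [
    ("Normans", "settled_in", "Normandy"),
    ("Normandy", "located_in", "France"),
    ("France", "part_of", "Europe"),
]
subgraph = query_subgraph(toy_triples, "Where is Normandy located?", max_hops=1)
```

With `max_hops=1`, the triple about Europe is dropped because Europe is two hops from Normandy; raising `max_hops` widens the context handed to a downstream question-answering model.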

Previous work

  • Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction. Yi Luan et al. (EMNLP 2018)

  • One-Shot Relational Learning for Knowledge Graphs. Wenhan Xiong et al. (EMNLP 2018)

  • Reading Wikipedia to Answer Open-Domain Questions. Danqi Chen et al. (ACL 2017)

  • The language representation model BERT (https://github.com/google-research/bert)

Timeline

  • Build sentence- or paragraph-level knowledge graph prototypes (by Feb. 17)

  • Create initial visualization (by Mar. 3)

  • Finish multi-document knowledge graphs (by Mar. 30)

  • Complete initial neural network models (by Apr. 21)

  • Complete final visualization and competitive deep learning models (by May 18)