Automated construction of knowledge graphs (KGs) remains an expensive technical challenge that is beyond the reach of most enterprises and academic institutions. NOUS is an end-to-end framework for developing custom, knowledge-graph-driven analytics for arbitrary application domains. The uniqueness of our system lies in (A) its combination of curated KGs with knowledge extracted from unstructured text, (B) its support for advanced trending and explanatory questions on a dynamic KG, and (C) its ability to answer queries whose answers are spread across multiple data sources.
What does NOUS mean? "The capacity to reason with experiential knowledge."
NOUS provides a complete suite of capabilities needed to build a domain-specific knowledge graph from streaming data. This includes:
- Natural language processing (NLP)
- Entity and relationship mapping
- Confidence estimation using link prediction
- Rule learning and trend discovery using frequent graph mining
- Question answering using graph search
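To make the trend-discovery capability concrete, here is a minimal sketch of counting frequent patterns over a sliding window of streaming triples. It is a deliberate simplification and not the NOUS Mining implementation: a "pattern" is reduced to the predicate of a single edge, and the names `frequent_patterns`, `window_size`, and `min_support`, as well as the sample triples, are assumptions made for illustration.

```python
from collections import Counter, deque


def frequent_patterns(triple_stream, window_size, min_support):
    """Yield (position, frequent) after each arriving triple.

    Simplifying assumption: a pattern is the predicate of one edge,
    not a multi-edge subgraph as in real frequent graph mining.
    """
    window = deque()    # predicates currently inside the window
    counts = Counter()  # support of each predicate in the window
    for position, (subject, predicate, obj) in enumerate(triple_stream):
        window.append(predicate)
        counts[predicate] += 1
        # Expire the oldest edge once the window is over capacity.
        if len(window) > window_size:
            expired = window.popleft()
            counts[expired] -= 1
            if counts[expired] == 0:
                del counts[expired]
        frequent = {p: c for p, c in counts.items() if c >= min_support}
        yield position, frequent


# Hypothetical stream of (subject, predicate, object) triples.
stream = [
    ("CompanyA", "acquired", "CompanyB"),
    ("CompanyC", "acquired", "CompanyD"),
    ("CompanyA", "hired", "PersonX"),
    ("CompanyE", "acquired", "CompanyF"),
    ("CompanyB", "hired", "PersonY"),
]
snapshots = list(frequent_patterns(stream, window_size=3, min_support=2))
```

Comparing consecutive snapshots shows which patterns gain or lose support as the window slides, which is the basic signal behind a trend query.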
Publications:
- Choudhury S, K Agarwal, S Purohit, B Zhang, M Pirrung, W Smith, and M Thomas. 2017. "NOUS: Construction and Querying of Dynamic Knowledge Graphs." In 2017 IEEE 33rd International Conference on Data Engineering (ICDE), pp. 1563-1565. IEEE.
- Zhang B, S Choudhury, M Al-Hasan, X Ning, P Pesantez, S Purohit, and K Agarwal. 2016. "Trust from the past: Bayesian Personalized Ranking based Link Prediction in Knowledge Graphs." In 2016 SIAM Data Mining Workshop on Mining Networks and Graphs: A Big Data Analytic Challenge.
- Choudhury S, K Agarwal, S Purohit. 2016. "Navigating the Maps of Science." Slides.
- Choudhury S, C Dowling. 2014. "Benchmarking Named Entity Disambiguation approaches for Streaming Graphs." Technical report.
Modules:
- TripleExtractor: Contains the NLP code; takes a text document as input and produces triples of the form (subject, predicate, object)
- EntityDisambiguation: Entity linking to a given KG (implements the algorithm in "Collective Entity Linking in Web Text: A Graph-Based Method", Han et al., SIGIR 2011)
- Mining: Given a streaming graph, finds frequent patterns
- Search: Given an attributed graph and entity pairs, returns all paths
- LinkPrediction: Confidence estimation of each link in the graph using Naive Bayes
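As an illustration of how extracted triples feed path search, the sketch below stores (subject, predicate, object) triples in an adjacency list and enumerates all simple paths between an entity pair. It is a minimal sketch under stated assumptions, not the code of the Search module: the helper names `build_adjacency` and `all_paths`, the `max_hops` bound, the directed-edge treatment, and the sample triples are all hypothetical.

```python
from collections import defaultdict


def build_adjacency(triples):
    """Index triples by subject: subject -> [(predicate, object), ...]."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        adjacency[subject].append((predicate, obj))
    return adjacency


def all_paths(adjacency, source, target, max_hops=3):
    """Return every simple directed path from source to target.

    Each path is a list of (subject, predicate, object) edges and has
    at most max_hops edges. Edges are followed subject -> object only.
    """
    paths = []

    def extend(node, visited, path):
        if node == target and path:
            paths.append(list(path))
            return
        if len(path) == max_hops:
            return
        for predicate, neighbor in adjacency.get(node, []):
            if neighbor in visited:
                continue  # keep paths simple (no repeated entities)
            path.append((node, predicate, neighbor))
            visited.add(neighbor)
            extend(neighbor, visited, path)
            visited.remove(neighbor)
            path.pop()

    extend(source, {source}, [])
    return paths


# Hypothetical triples, as a triple extractor might emit them.
triples = [
    ("PersonX", "works_at", "CompanyA"),
    ("CompanyA", "acquired", "CompanyB"),
    ("PersonX", "founded", "CompanyB"),
    ("CompanyB", "located_in", "CityZ"),
]
graph = build_adjacency(triples)
paths = all_paths(graph, "PersonX", "CompanyB")
```

Bounding the search with `max_hops` keeps the enumeration tractable, since the number of simple paths can grow exponentially with path length on dense graphs.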
Prerequisites:
- Java 1.7 or above
- Maven 3.0 or above
- Apache Spark 2.0 or above
- Scala 2.10
- HDFS (optional)
Clone the GitHub repository:
git clone https://github.com/streaming-graphs/NOUS.git NOUS
All NOUS modules (except LinkPrediction) are written in Scala and can be compiled with Maven. LinkPrediction is written in Python and can be run directly. Run a Maven build in either module, TripleExtractor or Mining. Example:
cd [Repo_Home]/TripleExtractor
mvn package
Here [Repo_Home] is the path to your cloned NOUS directory.
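If you prefer to script the build, the following sketch wraps the same `cd` and `mvn package` steps in Python. It is only a convenience sketch: the helper names `build_command` and `build_module` and the `MAVEN_MODULES` tuple are hypothetical, and the module list simply mirrors the two modules named above. It assumes `mvn` is available on your PATH.

```python
import subprocess
from pathlib import Path

# Modules named in the build instructions above (assumed to be
# directory names directly under the repository root).
MAVEN_MODULES = ("TripleExtractor", "Mining")


def build_command(repo_home, module):
    """Return (command, working_directory) for building one module."""
    if module not in MAVEN_MODULES:
        raise ValueError(f"{module!r} is not listed as a Maven-built module")
    return ["mvn", "package"], Path(repo_home) / module


def build_module(repo_home, module):
    """Run `mvn package` inside the module directory; raise on failure."""
    command, module_dir = build_command(repo_home, module)
    subprocess.run(command, cwd=module_dir, check=True)


# Usage (not executed here), with repo_home set to your cloned NOUS path:
#   build_module("/path/to/NOUS", "TripleExtractor")
```

Passing `check=True` makes a failed Maven build raise `subprocess.CalledProcessError` instead of failing silently.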
NOUS is organized into multiple modules that support the KG workflow. Each module contains a README and the data needed to run its examples. Refer to the module's README for further details.
Hypothesis Generation using Deep Learning