This repository contains a prototype Google Chrome extension for web knowledge graph ingestion: the process of taking semi-structured data from ordinary websites and loading it into a knowledge graph.
The following three sections describe how to get started with this extension.
- Install Google Chrome.
- Create a new user profile for Google Chrome to ensure that no other extensions interfere with the one contained in this repository.
- Clone this git repository and note the location of the folder that you clone it into:
    git clone https://github.com/schasins/web-knowledge-graph-ingestion.git
- In the new Google Chrome user profile, paste chrome://extensions/ into your URL bar to load a page that lists all your extensions.
- Enable developer mode. If developer mode is not currently enabled, click the toggle switch next to the text "Developer mode" to turn it on.
- Click the "Load unpacked" button and select the folder that you cloned this git repo into.
- Click the puzzle piece icon that appears in the top-right of Chrome and pin the "Web Knowledge Graph Ingestion" extension.
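If "Load unpacked" rejects the folder, the usual cause is that the selected folder does not directly contain a `manifest.json` file. The Node.js sketch below is one way to check this before loading. The helper name `checkUnpackedExtension` is hypothetical and not part of this repository, and the check uses strict JSON parsing, which may be stricter than Chrome's own manifest loader.

```javascript
// Hypothetical helper (not part of this repository): check that a folder
// looks like an unpacked Chrome extension before using "Load unpacked".
const fs = require("fs");
const path = require("path");

function checkUnpackedExtension(folder) {
  const absoluteFolder = path.resolve(folder);
  const manifestPath = path.join(absoluteFolder, "manifest.json");

  // Chrome expects manifest.json directly inside the selected folder.
  if (!fs.existsSync(manifestPath)) {
    return { ok: false, reason: `no manifest.json in ${absoluteFolder}` };
  }

  let manifest;
  try {
    // Strict JSON parse: an assumption of this sketch, not Chrome's exact behavior.
    manifest = JSON.parse(fs.readFileSync(manifestPath, "utf8"));
  } catch (err) {
    return { ok: false, reason: `manifest.json is not valid JSON: ${err.message}` };
  }

  return {
    ok: true,
    folder: absoluteFolder, // the path to select in the "Load unpacked" dialog
    name: manifest.name,
    manifestVersion: manifest.manifest_version,
  };
}

// Example, run from the directory in which you ran `git clone`:
// console.log(checkUnpackedExtension("./web-knowledge-graph-ingestion"));
```

The `folder` field in the result is the absolute path to the cloned repository, which is the location the clone step asks you to note.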
This mini example demonstrates how the extension can extract data from a highly structured HTML table into a CSV file.
- In the Web Knowledge Graph Ingestion Chrome user profile, navigate to Wikipedia's list of tallest mountains.
- Scroll down to the table titled "The 125 most topographically prominent summits on Earth".
- Click the Web Knowledge Graph Ingestion extension in the top-right of Chrome.
- Click the "Click here to enter data demonstration mode instead" button.
- Click any cell in the table.
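For intuition about what extracting a table into a CSV involves, here is a minimal sketch that reads the text of each cell and serializes the rows. It is not this extension's actual implementation: the function names `tableToRows`, `csvField`, and `rowsToCsv` are hypothetical, and the sketch ignores cells that span multiple rows or columns.

```javascript
// Hypothetical helpers for illustration; not taken from this extension's source.

// Collect the trimmed text of every cell, row by row.
// Accepts an HTMLTableElement (via its `rows` and `cells` collections).
function tableToRows(table) {
  return Array.from(table.rows, (row) =>
    Array.from(row.cells, (cell) => cell.textContent.trim())
  );
}

// Quote a field only if it contains a comma, a double quote, or a line break,
// doubling any embedded double quotes (the RFC 4180 convention).
function csvField(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Join fields with commas and rows with CRLF line breaks.
function rowsToCsv(rows) {
  return rows.map((row) => row.map(csvField).join(",")).join("\r\n");
}
```

To try it, paste the functions into the DevTools console of a page that contains a table and evaluate `rowsToCsv(tableToRows(document.querySelector("table")))`, which serializes the first table on the page.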
When modifying this extension, you will need to reload it in Chrome to see the effects of your changes. Here is how to do so:
- In the Web Knowledge Graph Ingestion Chrome user profile, navigate to chrome://extensions/
- Click the gray circular reload button in the bottom-right of the Web Knowledge Graph Ingestion extension.
- Reload the webpage in which you want to use the extension.
- Reopen the extension.
Any changes made to the extension should now be reflected in the version that's running.
The following resources may be helpful when working on this extension: