rdedhia/dse203_final

Run the crawler in wiki_crawler

scrapy crawl -s DEPTH_LIMIT=1 wiki_entities
This outputs entities.json.
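
The crawl can also be launched from Python, for example from the top of a notebook. The following is a minimal sketch and not part of the repository: the helper names `build_crawl_command` and `run_crawl` are hypothetical, and it assumes that `scrapy` is on the PATH, that the command is run inside the wiki_crawler folder, and that entities.json is written to that same folder.

```python
import subprocess
from pathlib import Path


def build_crawl_command(spider="wiki_entities", depth_limit=1):
    """Return the scrapy command from the step above as an argument list."""
    return ["scrapy", "crawl", "-s", f"DEPTH_LIMIT={depth_limit}", spider]


def run_crawl(project_dir="wiki_crawler", expected_output="entities.json"):
    """Run the crawl inside project_dir and return the path of the output file.

    Assumption: the spider writes expected_output into project_dir.
    """
    project = Path(project_dir)
    # check=True raises CalledProcessError if scrapy exits with an error.
    subprocess.run(build_crawl_command(), cwd=project, check=True)
    output = project / expected_output
    if not output.exists():
        raise FileNotFoundError(f"crawl finished but {output} was not created")
    return output
```

Calling `run_crawl()` from the repository root would then run the same command as above and fail loudly if the output file is missing.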

Run the notebooks in the NLP folder

Run scrape_wiki.py to generate a JSON file for each of the main companies
Run add_triples_from_text.ipynb and add_more_triples.ipynb to extract new triples from those JSON files
This outputs all_entities.json
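
Before moving on, it can help to confirm that the generated files actually parse. The following is a minimal sketch; `check_json_file` is a hypothetical helper, and because the internal layout of all_entities.json is not described here, it only counts top-level items and makes no assumption about their fields.

```python
import json
from pathlib import Path


def check_json_file(path):
    """Parse a JSON file and return the number of top-level items it holds.

    Raises FileNotFoundError if the file is missing and json.JSONDecodeError
    (a subclass of ValueError) if the content is not valid JSON.
    """
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    # Lists and dicts report their length; any other JSON value counts as one item.
    return len(data) if isinstance(data, (list, dict)) else 1
```

For example, `check_json_file("all_entities.json")` returning 0 would suggest that the triple extraction produced nothing and the notebooks should be rerun.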

Copy all_entities.json into the BuildingNeo4jTables folder
Run the notebook generate_tables_CSVs.ipynb in the BuildingNeo4jTables folder
Copy all CSV files into the Neo4j database's "import" folder
Open Neo4j_Create_scripts.ipynb and execute the commands to generate the graph
Execute queries in neo4j_queries.txt
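
The two copy steps above can be scripted. The following is a minimal sketch; `copy_matching` is a hypothetical helper, and the location of the Neo4j "import" folder depends on the local installation, so it must be supplied by the caller. The destination is deliberately not created automatically, so that a mistyped import path raises an error instead of silently producing a new folder.

```python
import shutil
from pathlib import Path


def copy_matching(src_dir, dest_dir, pattern="*.csv"):
    """Copy every file in src_dir matching pattern into dest_dir.

    Returns the sorted list of copied file names. dest_dir must already exist.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    if not dest.is_dir():
        raise NotADirectoryError(f"destination folder not found: {dest}")
    copied = []
    for path in sorted(src.glob(pattern)):
        if path.is_file():
            # copy2 keeps file timestamps along with the content.
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied
```

With the CSV files generated in BuildingNeo4jTables, a call such as `copy_matching("BuildingNeo4jTables", neo4j_import_dir)` would cover the CSV step, where `neo4j_import_dir` is a placeholder for the local import folder. Passing `pattern="all_entities.json"` covers the earlier step of copying that file into BuildingNeo4jTables.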