This repo crawls data from the GitHub network of CVE-related directories, then builds the resulting relationship graph in a Neo4j schema.
-
Install requirements, run cmd:
pip install -r python_requirement.txt
-
Initialize neo4j
-
Download neo4j source to a folder
<NEO4J-HOME>
-
Set default admin(1), run cmd:
<NEO4J-HOME>/bin/neo4j-admin set-default-admin neo4j
-
Run neo4j console, run cmd:
<NEO4J-HOME>/bin/neo4j console
-
Open a web browser(2) and go to localhost:7474 to use the Neo4j Browser tool.
Log in to Neo4j (default username: neo4j, default password: neo4j)
-
Create new user, run cypher:
CREATE USER dat1 SET PASSWORD '1'
-
Assign a role to the user, run cypher(1):
GRANT ROLE admin TO dat1
-
Create new database(1):
CREATE DATABASE git2neo
Or import an existing database.
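To check from Python that the user and database created above are reachable, a minimal sketch follows. It assumes the official `neo4j` Python driver is installed (it may or may not be listed in python_requirement.txt) and that Neo4j listens on the default Bolt address bolt://localhost:7687. The helper names `connection_settings` and `check_connection` are hypothetical and are not part of this repo.

```python
def connection_settings(database="git2neo"):
    """Collect the values used in the setup steps in one place.

    Community edition users should pass database="neo4j", see note (1).
    """
    return {
        "uri": "bolt://localhost:7687",  # assumption: default Bolt port
        "user": "dat1",                  # user created with CREATE USER
        "password": "1",
        "database": database,
    }


def check_connection(settings):
    """Run a trivial query and return 1 if the server answers."""
    # Imported here so this file can be loaded without the driver installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(
        settings["uri"], auth=(settings["user"], settings["password"])
    )
    try:
        with driver.session(database=settings["database"]) as session:
            return session.run("RETURN 1 AS ok").single()["ok"]
    finally:
        driver.close()


if __name__ == "__main__":
    print(check_connection(connection_settings()))
```

If the call raises an authentication or "database not found" error, recheck the user, role, and database steps before running the crawler.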
-
-
Generate a new GitHub personal access token with the following scopes:
user repo public_repo repo_deployment repo:status read:repo_hook read:org read:public_key read:gpg_key
Replace the token in private_config.py with the new one.
- Run GitHub query
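Before running the crawler, it can help to confirm that the new token is accepted by GitHub. The sketch below only builds an authenticated request against the public rate_limit endpoint of the GitHub REST API. The variable name `GITHUB_TOKEN` and the helper `build_github_request` are hypothetical; private_config.py may store the token under a different name.

```python
import urllib.request

# Hypothetical placeholder; the real value lives in private_config.py.
GITHUB_TOKEN = "<YOUR-TOKEN>"


def build_github_request(token, url="https://api.github.com/rate_limit"):
    """Build (but do not send) a GitHub API request authenticated with a token."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"token {token}",
            "Accept": "application/vnd.github+json",
        },
    )


if __name__ == "__main__":
    request = build_github_request(GITHUB_TOKEN)
    # Sending the request needs network access and a valid token.
    with urllib.request.urlopen(request) as response:
        print(response.status)
```

A 401 response at this point usually means the token was mistyped or has been revoked.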
-
Run cmd:
python git2neo.py
-
- Run Analysis functions
- (1) Only the Enterprise edition has these features; the Community edition has to use the single default database.
- (2) Sometimes the Neo4j Browser shows nothing. This seems to be a bug. Switching to another web browser (Chrome, Edge, Firefox, ...) might help.
- (3) More sample queries are described in cypher_sample_queries.docx
Many thanks to the USC GRID-CKID Fall 2020 github-cve-social-graph team for developing the crawling tools used in this project.