Graph database version of the NVD CVE database by Jim Jazwiecki and Nathan Kluth, built using the official Neo4j Docker image (GitHub). Please post feedback/questions as GitHub issues.
- Run docker-compose build to download the Neo4j Docker image and install the APOC plugin
- Run docker-compose up to start Neo4j
- Run docker exec -t nvdgraph python3 code/nvd_loader.py to load the NVD data feeds
Run docker-compose up to start the container. Open http://localhost:7474/browser/
to start an interactive browser-based session. By default, no authentication is
set, but you can enable it in Docker from your local environment by setting the
NEO4J_AUTH environment variable to a username/password pair. For example:
export NEO4J_AUTH="admin/password"
There are two different Python environments used in this project. code/requirements.txt
is used by the Neo4j container to load the NVD data into Neo4j, but doesn't include the
Python Neo4j driver, because nvd_loader.py does not need it. To work with
the Neo4j database locally using Python and the official Neo4j Python package,
create a virtual environment and run pip install -r requirements.txt.
The following will work out of the box, unless you've set the NEO4J_AUTH environment variable:
from neo4j import GraphDatabase
driver = GraphDatabase.driver('bolt://localhost')
Then, to create a new session:
s = driver.session()
To run a query from within a session:
s.run("MATCH (n) RETURN n LIMIT 25")
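If NEO4J_AUTH is set, the driver needs matching credentials. The following is a minimal sketch, assuming the variable is also exported in the shell where the Python code runs and uses the username/password form shown earlier. The helper names auth_from_env and connect are hypothetical and not part of this project:

```python
import os


def auth_from_env(environ=os.environ):
    """Return a (username, password) tuple from NEO4J_AUTH, or None if unset."""
    value = environ.get("NEO4J_AUTH", "")
    # The official Neo4j image treats NEO4J_AUTH=none as "authentication disabled".
    if not value or value == "none":
        return None
    # Split on the first slash only, so the password may itself contain slashes.
    username, _, password = value.partition("/")
    return (username, password)


def connect(uri="bolt://localhost"):
    """Open a driver, passing credentials only when NEO4J_AUTH provides them."""
    from neo4j import GraphDatabase  # imported lazily; needs the neo4j package

    return GraphDatabase.driver(uri, auth=auth_from_env())
```

With this in place, driver = connect() should behave like the unauthenticated example when NEO4J_AUTH is not set.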
Details on specific field mappings are in code/loader-template.cypher, but broadly the
schema is as follows:
(AttackVector)
-[:ATTACKABLE_THROUGH]-
(CVE)
-[:AFFECTS]-
(ProductVersion)
-[:VERSION_OF]-
(Product)
-[:MADE_BY]-
(Vendor)
https://nvd.nist.gov/vuln/data-feeds lists all data feeds. nvd_loader.py fetches
this page, parses it to identify complete (i.e. not partial/update) v1.0 gzipped
JSON feed URLs, then fetches the files one by one, unpacks them, and loads them using
the template specified in loader-template.cypher.
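The URL-filtering step can be sketched roughly as follows. This is not the code in nvd_loader.py; it assumes that the page links to feeds by absolute URL, that complete feeds are named by year (for example nvdcve-1.0-2017.json.gz), and that partial feeds carry a non-year name. find_full_feed_urls is a hypothetical helper name:

```python
import re

# Assumed naming pattern for complete yearly v1.0 feeds, e.g. nvdcve-1.0-2017.json.gz.
# Partial/update feeds are assumed to have no four-digit year, so they do not match.
FULL_FEED_RE = re.compile(r'https://[^"\'\s<>]*nvdcve-1\.0-\d{4}\.json\.gz')


def find_full_feed_urls(html):
    """Return the unique full-year feed URLs found in html, in order of appearance."""
    seen = []
    for url in FULL_FEED_RE.findall(html):
        if url not in seen:
            seen.append(url)
    return seen
```

Each returned URL could then be downloaded, decompressed with the standard gzip module, and handed to the loader template.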
First, the script sets uniqueness constraints on our nodes, implicitly creating
indices. $nvd_file_name is replaced during execution with the name of the
JSON file being loaded. Then we loop over all the individual CVEs.
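The $nvd_file_name placeholder matches the syntax of Python's string.Template, so the substitution can be illustrated as below. Whether nvd_loader.py actually uses string.Template is an assumption, and the one-line template and file path here are hypothetical stand-ins for code/loader-template.cypher:

```python
from string import Template

# Hypothetical stand-in for the contents of code/loader-template.cypher.
cypher_template = Template(
    "CALL apoc.load.json('file:///var/lib/neo4j/$nvd_file_name') YIELD value AS nvd "
    "UNWIND nvd.CVE_Items AS vuln RETURN count(vuln)"
)


def render_loader(file_name):
    """Fill in $nvd_file_name; safe_substitute leaves any other $name untouched."""
    return cypher_template.safe_substitute(nvd_file_name=file_name)
```

safe_substitute is used here rather than substitute so that any Cypher query parameters written as $name would pass through unchanged instead of raising an error.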
Run this command to load a test file, code/nvd-test-samples.json, which contains
a couple of sample CVEs:
CALL apoc.load.json('file:///var/lib/neo4j/code/nvd-test-samples.json') YIELD value AS nvd
UNWIND nvd.CVE_Items as vuln
RETURN vuln
See https://neo4j-contrib.github.io/neo4j-apoc-procedures/ for more details on apoc.load.json.
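For a quick look at a feed file without Neo4j, the UNWIND step corresponds to iterating over the top-level CVE_Items array. A minimal sketch follows; the inline document is a made-up sample with placeholder IDs, and the cve.CVE_data_meta.ID path is an assumption about the v1.0 feed layout:

```python
import json

# Made-up stand-in for a feed file; real files contain many more fields per item.
sample = json.loads("""
{"CVE_Items": [
  {"cve": {"CVE_data_meta": {"ID": "CVE-0000-0001"}}},
  {"cve": {"CVE_data_meta": {"ID": "CVE-0000-0002"}}}
]}
""")


def iter_cve_ids(nvd):
    """Yield the ID of each item, mirroring UNWIND nvd.CVE_Items AS vuln."""
    for vuln in nvd.get("CVE_Items", []):
        yield vuln["cve"]["CVE_data_meta"]["ID"]


cve_ids = list(iter_cve_ids(sample))
```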
Calculate the number of vulnerabilities found in the same software with the same impact score:
MATCH
(vendor:Vendor {name: 'microsoft'})
-[:MADE_BY]-
(product:Product {name: 'edge'})
-[:VERSION_OF]-
(product_version:ProductVersion)
-[:AFFECTS]-
(cve:CVE)
RETURN cve.`v2.impact_score`, count(cve)
Calculate the number of vulnerabilities found in the same software with the same impact score across two consecutive years:
UNWIND range(1988, 2018, 1) AS t
WITH
apoc.date.fromISO8601(t + '-01-01T00:00:00.000Z')
AS start_window,
apoc.date.fromISO8601((t + 1) + '-12-31T23:59:59.999Z')
AS end_window
MATCH
(vendor: Vendor)
-[:MADE_BY]-
(product:Product)
-[:VERSION_OF]-
(product_version:ProductVersion)
-[:AFFECTS]-
(cve:CVE)
WHERE
cve.published >= start_window
AND
cve.published <= end_window
RETURN
apoc.date.format(start_window)
AS start_window,
apoc.date.format(end_window)
AS end_window,
vendor.name,
product.name,
cve.`v2.impact_score`
AS impact_score,
count(cve) AS vulnerabilities
ORDER BY
start_window,
vendor.name,
product.name
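The window arithmetic in this query can be checked outside Neo4j. The sketch below assumes that apoc.date.fromISO8601 returns milliseconds since the Unix epoch and that cve.published is stored in the same unit; window_ms and to_epoch_ms are hypothetical helper names:

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt):
    """Convert an aware datetime to whole milliseconds since the Unix epoch."""
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def window_ms(year):
    """Return (start, end) for the window from Jan 1 of year to Dec 31 of year + 1."""
    start = datetime(year, 1, 1, tzinfo=timezone.utc)
    end = datetime(year + 1, 12, 31, 23, 59, 59, 999000, tzinfo=timezone.utc)
    return to_epoch_ms(start), to_epoch_ms(end)
```

Because each window spans two calendar years and the range advances one year at a time, consecutive windows overlap: a CVE published in 1989 falls into both the 1988 and the 1989 window.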
Find the count of vulnerabilities that require user interaction through the network, under the CVSS 3.0 definition of "attack vector":
MATCH
(attack_vector:AttackVector {name: 'NETWORK'})
-[:ATTACKABLE_THROUGH {cvss_version: 3}]-
(cve:CVE {`v3.user_interaction`: 'REQUIRED'})
RETURN count(cve)
Identify easy network-accessible exploits that require no user interaction and that simplify access to high-impact exploits which would otherwise be high-complexity:
MATCH
(attack_vector:AttackVector {
name: 'NETWORK'
}
)
-[:ATTACKABLE_THROUGH]-
(cve:CVE {
`v3.scope`: 'CHANGED',
`v3.user_interaction`: 'NONE',
`v3.attack_complexity`: 'LOW'
}
)
-[:AFFECTS]-
(product_version:ProductVersion)
-[:AFFECTS]-
(escalated_cve:CVE {
`v3.privileges_required`: 'HIGH',
`v3.user_interaction`: 'NONE',
`v3.integrity_impact`: 'HIGH'}
)
-[:ATTACKABLE_THROUGH]-
(escalated_vector:AttackVector {name: 'LOCAL'})
RETURN cve.name,
product_version.name,
escalated_cve.name