
Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Vulnerability Data Tools

This project is meant to help enrich vulnerability data. It was created because of the uncertain nature of NVD. There are currently two repos. This repo hosts the tooling needed to parse vulnerability data. We will also use this repo for discussions and planning.

Please feel free to browse the current issues and submit new ones with ideas and questions. There is also a public Slack channel, #vulnerability-data-project, for questions and comments.

The output of these tools goes to the second repo, the nvd-data-overrides project. That repo holds override data in the NVD format. It should hold only data; no planning or discussion should happen there.

The NVD Data Overrides repo hosts a FAQ that answers some of the questions about how this project will work now and in the future. The FAQ is very specific to NVD, while this repo will be broader than just NVD.

Project layout

nvd - Scripts that turn the cve5 data into NVD-compatible CPE data. Please see the readme in that directory for more details.
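
As a rough illustration of the kind of conversion the nvd scripts perform, here is a minimal sketch that turns one cve5-style "affected" entry into NVD-style cpeMatch dictionaries. It is not the code in the nvd directory: the function names, the simplified name normalization, and the assumption that every product is an application (CPE part "a") are all hypothetical choices made for this example.

```python
def _cpe_component(value):
    """Simplified normalization for a CPE field (assumption: lowercase, spaces to underscores)."""
    return value.strip().lower().replace(" ", "_")


def cpe_matches_from_affected(affected):
    """Build NVD-style cpeMatch dicts from one cve5-style `affected` entry.

    Hypothetical helper: only handles `version`, `lessThan`, and `status`.
    """
    vendor = _cpe_component(affected["vendor"])
    product = _cpe_component(affected["product"])
    matches = []
    for version in affected.get("versions", []):
        # Only versions marked as affected become vulnerable CPE matches.
        if version.get("status") != "affected":
            continue
        if "lessThan" in version:
            # A range: wildcard the CPE version field and carry the bounds separately.
            match = {
                "vulnerable": True,
                "criteria": f"cpe:2.3:a:{vendor}:{product}:*:*:*:*:*:*:*:*",
                "versionEndExcluding": version["lessThan"],
            }
            # "0" conventionally means "from the beginning", so no lower bound is needed.
            if version["version"] not in ("0", "*"):
                match["versionStartIncluding"] = version["version"]
        else:
            # A single version goes directly into the CPE version field.
            match = {
                "vulnerable": True,
                "criteria": f"cpe:2.3:a:{vendor}:{product}:{version['version']}:*:*:*:*:*:*:*",
            }
        matches.append(match)
    return matches


example_affected = {
    "vendor": "Example Vendor",
    "product": "Widget",
    "versions": [
        {"version": "1.0", "lessThan": "1.4.2", "status": "affected"},
        {"version": "2.0", "status": "affected"},
        {"version": "3.0", "status": "unaffected"},
    ],
}
example_matches = cpe_matches_from_affected(example_affected)
```

Real cve5 records carry more cases than this (for example lessThanOrEqual bounds and default statuses), so treat the sketch as a starting point for reading the actual scripts rather than a description of them.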

Future efforts

We have a Google document that describes some ideas and concepts for a later vulnerability enrichment project. Long term, we would like to provide vulnerability enrichment in a much more sustainable way. The data in this repository will be included in those future efforts, so this work will not be wasted.

https://docs.google.com/document/d/1ccW_ng9HVwuTWiL2dGC5Tqb_CKef6pAEwRQ4tg_aDgw/edit#heading=h.7lelh5vxqxu4

Feel free to add comments to that Google document. Over the coming days it will be migrated here.

We have a lot of ideas on how to do this better in the future. We envision a data format capable of generating the data currently stored in this repository. The NVD format is very constrained. By capturing the same data in a less constrained, format-neutral structure, it will be possible to output any format needed: NVD, OSV, cve5, and more. Think of this repository as a place to learn what we don't know yet.
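
To make the idea of a single source format with several outputs more concrete, here is a small sketch. The future data format has not been designed yet, so the AffectedRange class and the three renderer functions below are purely hypothetical; they only show how one neutral record could be rendered as an NVD-style cpeMatch entry, an OSV-style range, and a cve5-style version entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AffectedRange:
    """Hypothetical format-neutral record; not an agreed project schema."""
    vendor: str
    product: str
    introduced: Optional[str]  # first affected version, None if unknown
    fixed: Optional[str]       # first fixed version, None if not fixed


def to_nvd_cpe_match(r):
    """Render as an NVD-style cpeMatch dict (assumes CPE part "a")."""
    match = {
        "vulnerable": True,
        "criteria": f"cpe:2.3:a:{r.vendor}:{r.product}:*:*:*:*:*:*:*:*",
    }
    if r.introduced:
        match["versionStartIncluding"] = r.introduced
    if r.fixed:
        match["versionEndExcluding"] = r.fixed
    return match


def to_osv_range(r):
    """Render as an OSV-style range (assumes semantic versioning)."""
    events = [{"introduced": r.introduced or "0"}]
    if r.fixed:
        events.append({"fixed": r.fixed})
    return {"type": "SEMVER", "events": events}


def to_cve5_version(r):
    """Render as a cve5-style entry for an `affected[].versions` list."""
    version = {"version": r.introduced or "0", "status": "affected"}
    if r.fixed:
        version["versionType"] = "semver"
        version["lessThan"] = r.fixed
    return version


example_range = AffectedRange("example_vendor", "widget", "1.0", "1.4.2")
rendered = {
    "nvd": to_nvd_cpe_match(example_range),
    "osv": to_osv_range(example_range),
    "cve5": to_cve5_version(example_range),
}
```

The point of the sketch is the direction of the data flow: the neutral record is the source of truth, and each output format is a small, replaceable rendering step.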

Regardless of the data format used, this override data is expected to be generated and available for the foreseeable future.