fkie-cad/nvd-json-data-feeds

Question: release auto-update script source code

Opened this issue · 8 comments

Would it be possible to share the script you use to query NVD API and auto-update the data on this repository?

At some point we can certainly release the code that auto-updates this repo. However, we would like to make it a little more robust first :-). Please give us some time for that.

I'm glad to re-open this issue once we're ready!

Thanks, and also thanks for the detailed explanation in #2.

The reason I'm asking for the script source is to be able to regenerate the data locally, or perhaps mirror the data in another repository, in case you stop caching the NVD data on this repo for one reason or another.

Anyway, thanks for the great work you are doing here!

@rhelmke Sorry to chime in on this old issue: has there been any progress in making the mirroring
scripts available?

Additionally, we would also be very much interested in the scripts that aggregate the
individual CVEs into the daily feeds. Indeed, those feeds are short-lived; they are replaced
daily. As such, there is no possibility to do reproducible builds.

For example, in our project, Buildroot, we are tracking a regression
in our tooling that occurred around 2024-02-07. Unfortunately, we can't validate when the
issue actually happened, because the CVE feed from that day is no longer available. Since this
is a git tree, we could easily reconstruct the feed from the individual entries, if the scripts
were available.

Hello @yann-morin-1998,

unfortunately there is still no release timeline for the software stack driving this repo. We are currently occupied with a lot of other projects and wouldn't be able to allocate the required resources at this time - I'm sorry.

Either way, the packaging code wouldn't help you reconstruct any daily packages from this repo's history. This is because the code also uses our OpenSearch backend and is not a standalone script.

However, we certainly see and understand the issues you are facing in terms of reproducibility. In fact, the idea of providing companion scripts that can reliably reconstruct historical packages has been around for a while. I assume that we could provide such a script and verify its correctness in a manageable amount of time. Give us maybe a week and we'll see what we can do :-).

On another note, we also thought about not wiping historical release packages, but refrained from the idea because it would certainly create a lot of duplicate data to host. And it is truly unnecessary considering that a companion script could use the git history for reconstruction.

@rhelmke Thanks for the feedback, and thanks for considering our request. That's very much appreciated.

How open are you to contributions? I have been playing with a little Python script here that walks the individual CVE directories in the repository and generates reproducible yearly archives. It's working now, and just needs a little eye-candy. Shall I open a PR?
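
For illustration, here is a minimal sketch of what such a directory-walking script could look like. It is not the script mentioned above. It assumes the checkout keeps one JSON file per CVE somewhere below a `CVE-<year>/` directory, and the function name `build_year_archive` is hypothetical. Reproducibility comes from sorting the file list and pinning every timestamp and owner field, so two runs over the same tree should yield byte-identical archives (given the same zlib version).

```python
import gzip
import io
import tarfile
from pathlib import Path


def build_year_archive(repo_root: Path, year: int, out_path: Path) -> int:
    """Pack all CVE JSON files of one year into a deterministic .tar.gz.

    Returns the number of files written to the archive.
    """
    # Assumption: per-CVE JSON files live below "<repo_root>/CVE-<year>/".
    year_dir = repo_root / f"CVE-{year}"
    names = sorted(
        p.relative_to(repo_root).as_posix()
        for p in year_dir.rglob("*.json")
        if p.is_file()
    )
    with open(out_path, "wb") as raw:
        # An empty filename and mtime=0 keep the gzip header constant.
        with gzip.GzipFile(filename="", fileobj=raw, mode="wb", mtime=0) as gz:
            with tarfile.open(fileobj=gz, mode="w", format=tarfile.GNU_FORMAT) as tar:
                for name in names:
                    data = (repo_root / name).read_bytes()
                    info = tarfile.TarInfo(name=name)
                    info.size = len(data)
                    # Pin all metadata that would otherwise vary per checkout.
                    info.mtime = 0
                    info.mode = 0o644
                    info.uid = info.gid = 0
                    info.uname = info.gname = ""
                    tar.addfile(info, io.BytesIO(data))
    return len(names)
```

Calling `build_year_archive(Path("."), 2024, Path("CVE-2024.tar.gz"))` from the root of a checkout would then write one archive for that year.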

@yann-morin-1998 thank you very much! We're of course open to PRs and would really appreciate it. But I'm not quite sure if this is the right repository for it. I thought about a tool that would take an ISO date as input, automatically clone the repo, check out the correct commit, and then recreate the packages. It might be better to move the script to another repo such that it does not have to sit in the working file tree.

Let me talk to a colleague of mine, he might be able to quickly throw together a Python package for that. I'll (or he'll) let you know how he'd like to proceed :-).
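
The "ISO date in, matching commit out" step of the tool described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the repository is already cloned locally, the branch is called `main`, and "the correct commit" is taken to mean the last commit made on or before the end of the given day in UTC. The helper names `rev_list_command` and `checkout_for_date` are hypothetical; the `git rev-list` and `git checkout` invocations are standard git.

```python
import subprocess
from datetime import date


def rev_list_command(iso_date: str, branch: str = "main") -> list[str]:
    """Build the git command that finds the last commit up to the given day."""
    day = date.fromisoformat(iso_date)  # raises ValueError on malformed input
    # Assumption: cut off at the end of the day, interpreted as UTC.
    cutoff = f"{day.isoformat()} 23:59:59 +0000"
    return ["git", "rev-list", "-n", "1", f"--before={cutoff}", branch]


def checkout_for_date(repo_dir: str, iso_date: str, branch: str = "main") -> str:
    """Check out the commit for the given day and return its hash."""
    result = subprocess.run(
        rev_list_command(iso_date, branch),
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    commit = result.stdout.strip()
    if not commit:
        raise LookupError(f"no commit on {branch} on or before {iso_date}")
    # Detached HEAD: the working tree now reflects the repository on that day.
    subprocess.run(["git", "checkout", "--detach", commit], cwd=repo_dir, check=True)
    return commit
```

After `checkout_for_date("nvd-json-data-feeds", "2024-02-07")`, a packaging step could run over the working tree to recreate that day's packages. Note that `--before` filters on the committer date, which may or may not match how the original packages were cut.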