LLNL/scraper

Redundant License Information (GitHub)

LRWeber opened this issue · 0 comments

It looks like some license information is maintained by hand in scraper/github/util.py.

Complete license listings are readily available from GitHub and seem to make this script redundant.

via the REST API
https://api.github.com/licenses

or via the GraphQL API

{
  licenses {
    spdxId
    name
    url
  }
}
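As a rough illustration of the REST route, here is a minimal Python sketch that downloads the listing and indexes it by SPDX ID. The function names `fetch_licenses` and `build_license_index` are hypothetical and not part of scraper; the sketch assumes each entry in the response carries `spdx_id`, `name`, and `url` fields, and it ignores authentication and rate limiting.

```python
import json
import urllib.request

LICENSES_URL = "https://api.github.com/licenses"


def fetch_licenses(url=LICENSES_URL):
    """Download the license listing from the GitHub REST API (unauthenticated)."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def build_license_index(licenses):
    """Map each SPDX ID to its name and URL.

    Hypothetical helper: entries without an SPDX ID are skipped.
    """
    return {
        entry["spdx_id"]: {"name": entry["name"], "url": entry["url"]}
        for entry in licenses
        if entry.get("spdx_id")
    }
```

Calling `build_license_index(fetch_licenses())` would produce a lookup table that could stand in for a hand-maintained mapping, at the cost of a network request.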

Perhaps we could rewrite scraper/code_gov/models.py and any other relevant code so that we no longer have to maintain these hard-coded mappings?

(Suggested in #37)