aboutcode-org/scancode-toolkit

Add License Category to ScanCode outputs and tools


It seems to be time to add ScanCode License Categories to the standard SCTK JSON output and make them available downstream to SCIO and SCWB. Neither SPDX nor CycloneDX defines a standard for License Categories, and the OSI Categories (https://opensource.org/licenses/) are not really intended to help you understand the license conditions. We may want to post an RFC on the topic, because we need to consider that ScanCode License Categories could become a de facto standard.

The reason to include them in the standard SCTK output is to make it easier to organize and prioritize an analysis of license detections by logically grouping the licenses by Category, e.g. prioritize analysis of Copyleft-licensed items over Permissive-licensed ones.
There will be significant design work to decide how to generate the initial License Category for a License Expression and then simplify the result as much as possible with simple rules. Possibly:

  1. Generate a License Category "expression" based on the licenses in the License Expression, keeping the License Expression notation (AND, OR, WITH, parentheses)
  2. Deduplicate the easy cases: e.g. Permissive AND Permissive => Permissive (there are usually many such cases)
  3. Define rules for translating common WITH cases to a single Category - see #2897
  4. Then work on more rules based on patterns we see for popular FOSS projects and packages.
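
As a rough illustration of steps 1 and 2 only, here is a minimal Python sketch. It is not ScanCode code: the `CATEGORIES` mapping, the function names, and the tiny hand-rolled parser are all assumptions made for this example, and a real implementation would more likely build on the existing license expression parsing and the LicenseDB category data. WITH sub-expressions are deliberately left unsimplified, since step 3 still needs rules.

```python
import re

# Hypothetical mapping for illustration; real data would come from the
# ScanCode license index, not from a hard-coded dict.
CATEGORIES = {
    "mit": "Permissive",
    "bsd-new": "Permissive",
    "apache-2.0": "Permissive",
    "gpl-2.0": "Copyleft",
    "classpath-exception-2.0": "Copyleft Limited",
}
TOKEN = re.compile(r"\(|\)|[^\s()]+")
OPERATORS = {"AND", "OR", "WITH"}


def to_category_expression(expression, categories=CATEGORIES):
    """Step 1: swap each license key for its category, keeping the notation."""
    out = []
    for tok in TOKEN.findall(expression):
        if tok in ("(", ")") or tok.upper() in OPERATORS:
            out.append(tok.upper())
        else:
            out.append(categories.get(tok.lower(), "Unknown"))
    return " ".join(out).replace("( ", "(").replace(" )", ")")


def simplify(expression, categories=CATEGORIES):
    """Step 2: collapse identical operands of the same AND or OR."""
    tokens = TOKEN.findall(expression)
    pos = 0

    def peek():
        return tokens[pos].upper() if pos < len(tokens) else None

    def atom():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = binary("OR")
            pos += 1  # skip the closing ")"
            return node
        node = categories.get(tok.lower(), "Unknown")
        if peek() == "WITH":  # kept verbatim; WITH rules are step 3
            pos += 1
            exception = categories.get(tokens[pos].lower(), "Unknown")
            pos += 1
            node = f"{node} WITH {exception}"
        return node

    def binary(op):
        nonlocal pos
        # OR binds loosest, then AND, then single items (including WITH pairs).
        parse_operand = (lambda: binary("AND")) if op == "OR" else atom
        operands = [parse_operand()]
        while peek() == op:
            pos += 1
            operands.append(parse_operand())
        unique = list(dict.fromkeys(operands))  # dedupe, keep first-seen order
        if len(unique) == 1:
            return unique[0]
        return "(" + f" {op} ".join(unique) + ")"

    result = binary("OR")
    return result[1:-1] if result.startswith("(") else result


print(to_category_expression("mit AND (bsd-new OR gpl-2.0)"))
print(simplify("mit AND bsd-new AND apache-2.0"))
print(simplify("mit AND (bsd-new OR gpl-2.0)"))
```

This sketch only removes exact duplicates at the same nesting level and assumes a well-formed expression; it does not flatten nested groups or reorder operands, which keeps the result easy to compare against the original License Expression.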