Support Attribution generation from SPDX license expressions in an [INPUT] file
mjherzog opened this issue · 4 comments
If you want to generate an Attribution Notice (attrib function) from an SBOM or other [INPUT] file that contains SPDX data for license expressions, that file will need to include a LicenseRef and license_file or notice_file data for every license that does not have a License Identifier in the SPDX License List. (Note: ScanCode already satisfies the second requirement with its pre-defined LicenseRef-scancode license identifiers.)
There are some open design questions:
- Do we need to validate that every LicenseRef from the [INPUT] file has at least a license_file or notice_file (or both)?
- Do we display the LicenseRef as an SPDX License Identifier for traceability?
- What validation against the SPDX License List is required for asserted SPDX License Identifiers (not including LicenseRef cases)?
Here is my suggestion:
We can use https://scancode-licensedb.aboutcode.org/index.json to convert each SPDX license identifier back to its ScanCode license key. If a particular "spdx_license_key" cannot be found in index.json (because the identifier is invalid or is a custom LicenseRef), we can handle it as follows: if the license_file/notice_file field is populated, issue a warning saying that the spdx_license_key was not found in LicenseDB but a license_file/notice_file is provided; if neither field is populated, raise an error saying that the spdx_license_key was not found and no license_file/notice_file is provided.
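To make the suggestion concrete, here is a minimal sketch of the lookup and the warning/error rule. It is not the tool's actual code: the function names are hypothetical, the sample entries are illustrative, and the field names "license_key" and "spdx_license_key" are assumed to match what index.json publishes.

```python
# Illustrative entries shaped like index.json records (field names assumed).
SAMPLE_INDEX = [
    {"license_key": "bsd-new", "spdx_license_key": "BSD-3-Clause"},
    {"license_key": "bsd-1988", "spdx_license_key": "LicenseRef-scancode-bsd-1988"},
]


def build_spdx_to_scancode(index_entries):
    """Map each SPDX key (lower-cased) to its ScanCode license key."""
    mapping = {}
    for entry in index_entries:
        spdx_key = entry.get("spdx_license_key")
        if spdx_key:  # skip entries without an SPDX key
            mapping[spdx_key.lower()] = entry["license_key"]
    return mapping


def resolve_spdx_key(spdx_key, mapping, has_license_or_notice_file):
    """Return (scancode_key, severity, message) for one SPDX identifier.

    Hypothetical helper: severity is None when the key is found, "WARNING"
    when it is missing but a license_file/notice_file is present, and
    "ERROR" when it is missing and no such file is present.
    """
    scancode_key = mapping.get(spdx_key.lower())
    if scancode_key is not None:
        return scancode_key, None, None
    if has_license_or_notice_file:
        message = (
            f"{spdx_key}: spdx_license_key not found in LicenseDB, "
            "but license_file/notice_file is filled"
        )
        return None, "WARNING", message
    message = (
        f"{spdx_key}: spdx_license_key not found in LicenseDB "
        "and no license_file/notice_file is filled"
    )
    return None, "ERROR", message


mapping = build_spdx_to_scancode(SAMPLE_INDEX)
found = resolve_spdx_key("BSD-3-Clause", mapping, False)
warned = resolve_spdx_key("LicenseRef-custom", mapping, True)
failed = resolve_spdx_key("LicenseRef-custom", mapping, False)
```

In practice the entries would be loaded from the published index.json rather than defined inline.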
And for the question:
Do we display the LicenseRef as an SPDX License Identifier for traceability?
I'd say yes, so that we do not alter the user's input.
After some thought, I actually think the tool does not need to validate the license_file/notice_file, for two reasons:
- Having a license_file/notice_file does not necessarily mean that the license/notice file covers the "invalid" license
- The tool fetches the license from DJE/LicenseDB. If we cannot translate the spdx_license back to the scancode_license, it is fair to call it "invalid" in the sense that it cannot be found in the database, and it is the user's responsibility to make sure their custom license has a proper license_file/notice_file filled in.
Here is the behavior of the latest code in the "513_attrib_from_spdx" branch:
spdx_license_expression: LicenseRef-scancode-libzip AND ((AFL-1.5 OR BSD-2-Clause-Views) AND LicenseRef-scancode-bsd-1988) OR AFL-1.3
output:
license_expression: bsd-new AND ((AFL-1.5 OR bsd-2-clause-views) AND bsd-1988) OR AFL-1.3
with error:
Command completed with 2 errors or warnings.
ERROR: <path> : Invalid 'license': AFL-1.5
ERROR: <path> : Invalid 'license': AFL-1.3
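For illustration, the behavior above can be approximated with a token-by-token substitution. This is only a sketch, not the code in the branch: the function name is hypothetical, the mapping holds just the pairs implied by the output above, and a real implementation would more likely use a proper license expression parser than a regular expression.

```python
import re

# Operators that must pass through unchanged.
OPERATORS = {"AND", "OR", "WITH"}

# Hypothetical mapping, limited to the pairs implied by the output above.
SPDX_TO_SCANCODE = {
    "licenseref-scancode-libzip": "bsd-new",
    "bsd-2-clause-views": "bsd-2-clause-views",
    "licenseref-scancode-bsd-1988": "bsd-1988",
}


def convert_expression(spdx_expression, mapping):
    """Replace known SPDX keys with ScanCode keys; collect unknown ones.

    Parentheses and whitespace are left untouched, so the shape of the
    expression is preserved. Unknown keys stay as written and are reported.
    """
    invalid = []

    def replace(match):
        token = match.group(0)
        if token.upper() in OPERATORS:
            return token
        scancode_key = mapping.get(token.lower())
        if scancode_key is None:
            invalid.append(token)
            return token
        return scancode_key

    # A token is any run of characters that is neither whitespace nor a paren.
    converted = re.sub(r"[^\s()]+", replace, spdx_expression)
    return converted, invalid


expression = (
    "LicenseRef-scancode-libzip AND ((AFL-1.5 OR BSD-2-Clause-Views) "
    "AND LicenseRef-scancode-bsd-1988) OR AFL-1.3"
)
converted, invalid = convert_expression(expression, SPDX_TO_SCANCODE)
for key in invalid:
    print(f"ERROR: Invalid 'license': {key}")
```

With this sample mapping, AFL-1.5 and AFL-1.3 are left unchanged in the converted expression and reported as invalid, matching the two errors shown above.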
@mjherzog @DennisClark Let me know what you think.
Fixed in 27b3068