aboutcode-org/license-expression

Problem with exception symbols when using `get_spdx_licensing().validate()`

Opened this issue · 5 comments

Hi, I am trying to validate a given LicenseExpression using get_spdx_licensing().validate(). This is very helpful for getting a list of unknown symbols that are not on the SPDX License and Exception Lists. However, I ran into a problem: symbols are not checked for their role, so an exception symbol is accepted in the license position and a license symbol in the exception position:

Example:

from license_expression import get_spdx_licensing

licensing = get_spdx_licensing()
le = licensing.parse("389-exception with MIT")
get_spdx_licensing().validate(le)

yields:

ExpressionInfo(
    original_expression='389-exception WITH MIT',
    normalized_expression='389-exception WITH MIT',
    errors=[],
    invalid_symbols=[]
)

As 389-exception is not a license and MIT is not an exception, I would expect an error here. Furthermore, I would find it helpful if there were two separate lists instead of the single invalid_symbols, for example invalid_license_symbols and invalid_exception_symbols.
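To make the expected result concrete, here is a minimal standalone sketch of a role-aware check. It does not use the library at all: `check_with_pair`, `KNOWN_LICENSES` and `KNOWN_EXCEPTIONS` are hypothetical names, and the two sets are tiny illustrative subsets of the SPDX lists, not the real data.

```python
# Illustrative subsets only; the real SPDX lists are much longer.
KNOWN_LICENSES = {"MIT", "Apache-2.0", "GPL-2.0-only"}
KNOWN_EXCEPTIONS = {"389-exception", "Classpath-exception-2.0"}


def check_with_pair(license_key, exception_key):
    """Check 'license_key WITH exception_key', holding each key to its own list.

    Hypothetical helper: the result keys mirror the two lists suggested above.
    """
    result = {"invalid_license_symbols": [], "invalid_exception_symbols": []}
    # The left-hand side of WITH must be a known license.
    if license_key not in KNOWN_LICENSES:
        result["invalid_license_symbols"].append(license_key)
    # The right-hand side of WITH must be a known exception.
    if exception_key not in KNOWN_EXCEPTIONS:
        result["invalid_exception_symbols"].append(exception_key)
    return result
```

With this kind of check, "389-exception WITH MIT" would report both symbols, each in its own list, while "GPL-2.0-only WITH Classpath-exception-2.0" would report nothing.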

I think this should be:

from license_expression import get_spdx_licensing

licensing = get_spdx_licensing()
le = licensing.parse("389-exception with MIT", validate=True, strict=True)

I get:

Traceback (most recent call last):
  File "/home/ayansinha/nexB/write_access/scancode-toolkit/venv/lib/python3.8/site-packages/license_expression/__init__.py", line 539, in parse
    tokens = list(self.tokenize(
  File "/home/ayansinha/nexB/write_access/scancode-toolkit/venv/lib/python3.8/site-packages/license_expression/__init__.py", line 603, in tokenize
    for token in tokens:
  File "/home/ayansinha/nexB/write_access/scancode-toolkit/venv/lib/python3.8/site-packages/license_expression/__init__.py", line 1070, in replace_with_subexpression_by_license_symbol
    raise ParseError(
boolean.boolean.ParseError: A license exception symbol can only be used as an exception in a "WITH exception" statement. for token: "389-exception"

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ayansinha/nexB/write_access/scancode-toolkit/venv/lib/python3.8/site-packages/license_expression/__init__.py", line 547, in parse
    raise ExpressionParseError(
license_expression.ExpressionParseError: A license exception symbol can only be used as an exception in a "WITH exception" statement. for token: "389-exception"

If I do:

from license_expression import Licensing

licensing = Licensing()
le = licensing.parse("389-exception with MIT", validate=True)

I get:

>>> le = licensing.parse("389-exception with MIT", validate=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ayansinha/nexB/write_access/scancode-toolkit/venv/lib/python3.8/site-packages/license_expression/__init__.py", line 559, in parse
    self.validate_license_keys(expression)
  File "/home/ayansinha/nexB/write_access/scancode-toolkit/venv/lib/python3.8/site-packages/license_expression/__init__.py", line 466, in validate_license_keys
    raise ExpressionError(msg)
license_expression.ExpressionError: Unknown license key(s): 389-exception, MIT

Thanks for the quick response, @AyanSinhaMahapatra, I did not think of the additional parameters during parsing.

Unfortunately, this does not solve my particular use case.
Let me be more precise: I need to validate a given LicenseExpression, not a given String.

Your example works if I get a String as input, but let's assume I have to validate the expression that someone generated by using this code (of course, they could have used the validate=True flag, but suppose they didn't):

licensing = get_spdx_licensing()
le = licensing.parse("389-exception with MIT")

So the only thing I get is le and now I have to validate it.
After your examples I'm even more of the opinion that get_spdx_licensing().validate(le) should not show me an empty errors list.

@AyanSinhaMahapatra @pombredanne
Any chance this might get resolved in the near future? :)

@armintaenzertng help would be much welcomed!

@armintaenzertng you can get back to a string with le.render() and then re-parse that string with Licensing().parse(expression, validate=True).
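A minimal sketch of that round trip, under a few assumptions: `revalidate` is a hypothetical helper name, the licensing object is passed in by the caller, and the strict flags are the ones used with get_spdx_licensing() earlier in this thread. It catches a broad Exception only to stay free of imports; real code would more likely catch the library's ExpressionError.

```python
def revalidate(expression, licensing):
    """Render a parsed expression back to text and re-parse it strictly.

    Returns None if the text re-parses cleanly, otherwise the error message.
    Hypothetical helper, not part of the license-expression API.
    """
    # Going through a string forces the parser to run again; an already
    # parsed expression object would not be re-checked.
    text = expression.render()
    try:
        licensing.parse(text, validate=True, strict=True)
    except Exception as exc:  # assumption: narrow this to ExpressionError in real code
        return str(exc)
    return None


# Usage, assuming the license-expression package is installed:
#   from license_expression import get_spdx_licensing
#   licensing = get_spdx_licensing()
#   le = licensing.parse("389-exception with MIT")
#   print(revalidate(le, licensing))
```

Going by the traceback above, the SPDX licensing object should reject the re-parsed "389-exception WITH MIT" with the "can only be used as an exception" message, whereas a bare Licensing() knows no keys and reports every symbol as unknown.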

I think the intended use here was this: when validating a license expression, we only check that the expression type and syntax are okay, and when parsing from a string and validating there, we check the license keys (and only create the LicenseSymbols if they are valid keys). But key validation could probably be implemented in validate() too, with an argument like validate_keys or similar?

@pombredanne what do you think?