aboutcode-org/scancode-workbench

View the matched text of a license match

Closed this issue · 4 comments

Description

When I review a license scan that has license matches (run with --license-text), I would like to view the actual matched text for each match.

As a refinement, I would also like to see the non-matched parts highlighted, by displaying a visual diff against the matched RULE text.

With this feature, I would be able to validate the correctness of a license match without having the scanned codebase on hand.

Notes: for the diff, we could use diff-match-patch, which has both a Python and a JS implementation.
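To make the diff idea concrete, here is a minimal sketch of a word-level diff between a RULE text and a matched text. It is not diff-match-patch: it is a plain longest-common-subsequence stand-in that only borrows that library's `[op, text]` tuple convention (-1 delete, 0 equal, 1 insert). The names `wordDiff` and `tokenize` are hypothetical.

```typescript
// Operation codes follow the diff-match-patch convention:
// -1 = only in the rule text, 0 = in both, 1 = only in the matched text.
type DiffOp = -1 | 0 | 1;
type Diff = [DiffOp, string];

// Split text into word and whitespace tokens so the diff can be re-joined losslessly.
function tokenize(text: string): string[] {
  return text.split(/(\s+)/).filter((t) => t.length > 0);
}

// Hypothetical helper: word-level diff based on a longest common subsequence table.
function wordDiff(ruleText: string, matchedText: string): Diff[] {
  const a = tokenize(ruleText);
  const b = tokenize(matchedText);

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  const out: Diff[] = [];
  // Append a token, merging it into the previous chunk when the operation is the same.
  const push = (op: DiffOp, token: string): void => {
    const last = out[out.length - 1];
    if (last && last[0] === op) last[1] += token;
    else out.push([op, token]);
  };

  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      push(0, a[i]);
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      push(-1, a[i]);
      i++;
    } else {
      push(1, b[j]);
      j++;
    }
  }
  while (i < a.length) push(-1, a[i++]);
  while (j < b.length) push(1, b[j++]);
  return out;
}
```

Joining every chunk whose operation is not 1 gives back the rule text, and joining every chunk whose operation is not -1 gives back the matched text, so a viewer can render either side from the same diff.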

@OmkarPh this would be super useful! @LeChasseur ping... this would allow the review of scan matched license text, just using the JSON scan as an input and without having the scanned code on hand.

This would be a major new dataset to add to SCWB so the design and implementation work will be substantial.
Also, storing and processing all of the license text from a scan in the SQLite file may push the limits for SCWB.
Perhaps this should be a feature for SCIO instead.
In any case, I do not think that it is feasible to include this as a feature for v4.

@mjherzog:

> a major new dataset to add to SCWB so the

We have license reference data optionally now in SCTK output, so that will be used to get the RULE text, so we won't need to add the ability to query from LicenseDB.
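As a rough sketch of how the RULE text could be looked up from the scan itself, the snippet below indexes rule references by identifier. The top-level key `license_rule_references` and its `identifier` and `text` fields are assumptions about the SCTK JSON output when reference data is enabled, and `buildRuleTextIndex` is a hypothetical helper name.

```typescript
// Assumed shape of the optional reference data in an SCTK JSON scan.
interface RuleReference {
  identifier: string;
  text: string;
}

interface ScanWithReferences {
  // Assumed to be absent when the scan was run without reference data.
  license_rule_references?: RuleReference[];
}

// Hypothetical helper: map each rule identifier to its RULE text.
function buildRuleTextIndex(scan: ScanWithReferences): Map<string, string> {
  const index = new Map<string, string>();
  for (const rule of scan.license_rule_references ?? []) {
    index.set(rule.identifier, rule.text);
  }
  return index;
}
```

A match's rule identifier can then be resolved with `index.get(...)`, and an `undefined` result signals that the scan carries no reference data for that rule.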

> design and implementation work will be substantial.

For version 4 we were only planning a simple clickable modal that would show the matched_text for that match (if we had --license-text enabled for that scan), and this should be simple enough to implement, and useful as we don't have to visit the JSON license detection data below for this.
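The data side of such a modal could look roughly like the sketch below, which walks a parsed scan and collects one entry per license match. The field names (`license_detections`, `matches`, `matched_text`, `rule_identifier`, `start_line`, `end_line`) are assumptions about the SCTK JSON layout, and `collectMatchedTexts` is a hypothetical helper, not SCWB code.

```typescript
// Assumed (partial) shape of an SCTK JSON scan; only the fields used here.
interface LicenseMatch {
  license_expression: string;
  rule_identifier: string;
  start_line: number;
  end_line: number;
  matched_text?: string; // assumed present only when --license-text was used
}

interface LicenseDetection {
  matches: LicenseMatch[];
}

interface ScannedFile {
  path: string;
  license_detections?: LicenseDetection[];
}

interface Scan {
  files: ScannedFile[];
}

// One row of data for a hypothetical matched-text modal.
interface MatchedTextEntry {
  path: string;
  licenseExpression: string;
  ruleIdentifier: string;
  lines: string;
  matchedText: string | null; // null when the scan has no matched text
}

// Hypothetical helper: flatten all license matches of a scan into modal entries.
function collectMatchedTexts(scan: Scan): MatchedTextEntry[] {
  const entries: MatchedTextEntry[] = [];
  for (const file of scan.files) {
    for (const detection of file.license_detections ?? []) {
      for (const match of detection.matches) {
        entries.push({
          path: file.path,
          licenseExpression: match.license_expression,
          ruleIdentifier: match.rule_identifier,
          lines: `${match.start_line}-${match.end_line}`,
          matchedText: match.matched_text ?? null,
        });
      }
    }
  }
  return entries;
}
```

Keeping `matchedText` nullable lets the UI show a hint such as "scan was run without --license-text" instead of an empty modal.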

As @pombredanne said, the diff functionality would be a refinement on this, which can be revisited later if required (and not in version 4). It would also use only the JSON data; the only tasks would be adding the diff-match-patch library, calling it properly, and then adding some basic text highlighting accordingly (I'm not sure whether this is available in the JS library; we could use other libraries for this too).
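For the highlighting step, a minimal sketch is shown below. It assumes the diff arrives as `[op, text]` tuples in the shape diff-match-patch returns from `diff_main` (-1 delete, 0 equal, 1 insert) and turns them into escaped HTML. The names `renderDiffHtml` and `escapeHtml` and the choice of `<del>`/`<ins>`/`<span>` tags are illustrative assumptions; the library also appears to ship its own `diff_prettyHtml` helper, which may be enough on its own.

```typescript
// A diff chunk in the [op, text] shape: -1 = delete, 0 = equal, 1 = insert.
type DiffTuple = [number, string];

// Escape characters that are special in HTML so license text is shown verbatim.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: wrap each chunk in a tag that CSS can style.
function renderDiffHtml(diffs: DiffTuple[]): string {
  return diffs
    .map(([op, text]) => {
      const safe = escapeHtml(text);
      if (op === -1) return `<del>${safe}</del>`; // only in the RULE text
      if (op === 1) return `<ins>${safe}</ins>`; // only in the matched text
      return `<span>${safe}</span>`; // common to both
    })
    .join("");
}
```

Escaping before wrapping matters here because license texts routinely contain angle brackets (for example `<year>` placeholders) that would otherwise be parsed as markup.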

Closing this, as both the matched text viewer and the diff are implemented.
All corner cases mentioned in the issue are also handled. Any additional ones can be handled in separate issues.