Check licenses of used libraries
mikegerber opened this issue · 24 comments
dinglehopper is Apache-licensed. All libraries used as libraries need to have a compatible license, e.g. BSD, MIT, Apache or public domain. GPL-licensed programs used seem to be fine. See also #48 for a relevant discussion.
Checklist from requirements*.txt:
- click
- jinja2
- lxml
- uniseg
- numpy
- colorama
- MarkupSafe
- ocrd >= 2.20.1
- attrs
- multimethod == 1.3
- tqdm
- pytest
- pytest-flake8
- pytest-cov
- pytest-mypy
- black
click: BSD License (BSD-3-Clause)
jinja2: BSD License (BSD-3-Clause)
lxml: BSD License (BSD)
uniseg: MIT License (MIT)
numpy: BSD License (BSD)
colorama: BSD License (BSD)
MarkupSafe: BSD License (BSD-3-Clause)
ocrd: Apache License 2.0
attrs: MIT License (MIT)
multimethod: Apache Software License
tqdm: MIT License, Mozilla Public License 2.0
pytest: MIT License (MIT)
pytest-flake8: BSD License (BSD License)
(We are also not linking to it.)
pytest-cov: BSD License (MIT)
(We are also not linking to it.)
pytest-mypy: MIT License (MIT)
(We are also not linking to it)
black: MIT License (MIT)
(We are also not linking to it)
All libraries used have, to the best of my knowledge, compatible licenses.
@b2m expressed interest in creating a CI job to regularly check for licensing problems (#48), so I am reopening.
My notes:
pip-licenses --allow-only="MIT License;BSD License;Apache"
(addition of Apache is untested) seems to be an interesting approach
- I like that it's a whitelist
- I like that it also checks transitively (by checking everything installed by pip)
- It would be nice to keep this out of the usual test suite due to (possible) network activity (I haven't checked where it gets the license info from, though)
- I might do the check in ocrd-galley eventually, because the builds there are network heavy already, but it does not hurt to explore software options in this project's CI
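To illustrate the whitelist idea from the notes above, here is a minimal Python sketch that reads license information from the metadata of packages already installed in the environment, so it needs no network access. It is not how pip-licenses is implemented; the names ALLOWED, license_strings, is_allowed and find_violations are made up for this example, and matching by substring is an assumption.

```python
from importlib import metadata

# Assumed whitelist: substrings that mark a license as acceptable.
# Substring matching is crude and could over-match; it is only for illustration.
ALLOWED = ("MIT", "BSD", "Apache")


def license_strings(dist):
    """Collect license hints from a distribution's metadata.

    Prefers "License ::" trove classifiers and falls back to the
    free-text License field if no such classifier is present.
    """
    meta = dist.metadata
    classifiers = meta.get_all("Classifier") or []
    found = [
        c.split("::")[-1].strip()
        for c in classifiers
        if c.startswith("License ::")
    ]
    if not found and meta.get("License"):
        found = [meta["License"]]
    return found


def is_allowed(licenses, allowed=ALLOWED):
    """Return True only if every listed license matches the whitelist.

    A package without any license information is treated as not allowed.
    """
    return bool(licenses) and all(
        any(a.lower() in lic.lower() for a in allowed) for lic in licenses
    )


def find_violations(allowed=ALLOWED):
    """Map package name to its license strings for every installed
    distribution that does not pass the whitelist."""
    violations = {}
    for dist in metadata.distributions():
        licenses = license_strings(dist)
        if not is_allowed(licenses, allowed):
            violations[dist.metadata["Name"]] = licenses
    return violations
```

Because this walks everything installed in the environment, it also covers transitive dependencies. The rule is deliberately strict: a package that lists several licenses, such as tqdm above with MIT and MPL 2.0, would be reported by find_violations() and need a manual look.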
(Keeping it short as I am on my free day actually)
> (Keeping it short as I am on my free day actually)
Free day? Same as yesterday and the day before yesterday...
I have three proposals, let me know which one(s) you'd like to try:
licensed
- https://github.com/github/licensed
- Provided by GitHub
- Available as GitHub Action
- Supports multiple technologies
- Integrated in GitHub with Pull Requests and Branches
- Using configuration files
LicenseFinder
- https://github.com/pivotal/LicenseFinder
- Provided by pivotal
- Available as Docker Image
- Supports multiple technologies
- Integrated approval process
- Using configuration files
pip-licenses
- https://pypi.org/project/pip-licenses/
- Python (only) replacement for LicenseFinder (?)
- Supports only Python dependencies
- Supports only whitelisting
- No configuration file support
My thoughts:
- If it is ok to only check Python dependencies I would give pip-licenses a try as the setup is quite simple.
- If you want to check the licenses of other technologies as well (like the JavaScript dependencies in dinglehopper =) I would try licensed, as the integration in GitHub already is provided.
> I have three proposals, let me know which one(s) you'd like to try:
While having support for JavaScript is certainly interesting, the tools seem to require Bower, Yarn, or npm(?) for that, and switching to that is maybe a bit overkill for the three JS dependencies :) (Might do it anyway because of #2 someday.)
pip-licenses seems to be a simple solution for Python dependencies, so maybe try that first (+1). If it can do the license checking offline from a previously set up venv, that would be the best case.
> pip-licenses
> * Supports only whitelisting
I thought --fail-on supports blacklisting, but I'd prefer whitelisting anyway.
> My thoughts:
> * If it is ok to only check Python dependencies I would give pip-licenses a try as the setup is quite simple.
> * If you want to check the licenses of other technologies as well (like the JavaScript dependencies in dinglehopper =) I would try licensed, as the integration in GitHub already is provided.
Note that we're currently using CircleCI and, while I'm not super passionate about it, I am super passionate about not switching CI systems every few months ;-)
> pip-licenses seems to be a simple solution for Python dependencies, so maybe try that first (+1). If it can do the license checking offline from a previously set up venv, that would be the best case.
It seems to work offline!
> > pip-licenses seems to be a simple solution for Python dependencies, so maybe try that first (+1). If it can do the license checking offline from a previously set up venv, that would be the best case.
> It seems to work offline!
In other words: From my point of view, this makes it suitable for the normal test suite.
> While having support for JavaScript is certainly interesting, the tools seem to require Bower, Yarn, or npm(?) for that, and switching to that is maybe a bit overkill for the three JS dependencies :) (Might do it anyway because of #2 someday.)
Yes (packaging tool), Yes (overkill) and Yes (switching someday) =)
> I thought --fail-on supports blacklisting, but I'd prefer whitelisting anyway.
No idea why I had this in my notes... I struck it out in my original comment.
> Note that we're currently using CircleCI and, while I'm not super passionate about it, I am super passionate about not switching CI systems every few months ;-)
Why switching? Just use all in parallel
Regarding the integration of pip-licenses:
- The "cleanest" way would be to introduce version pinning (maybe with the help of pip-tools) and only run license-checks when requirements.txt changes.
- The "fastest" way (regarding integration) would be to run a license-check as extra step after each test run on each python-version.
- A compromise would be to have a license-check workflow restricted e.g. to the master branch.
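As a rough illustration of the "cleanest" option, the following Python sketch decides whether a license check is due by comparing a hash of the requirements files against a stored value. The function names and the stamp file are hypothetical and not part of dinglehopper or pip-licenses; a real CI setup would more likely use the path filters of the CI system.

```python
import hashlib
from pathlib import Path


def requirements_digest(paths):
    """Return a SHA-256 digest over the names and contents of the given files."""
    h = hashlib.sha256()
    for p in sorted(map(str, paths)):
        h.update(Path(p).name.encode())
        h.update(Path(p).read_bytes())
    return h.hexdigest()


def needs_license_check(paths, stamp_file):
    """Compare the current digest with the one stored in stamp_file.

    Returns (changed, digest). The caller is expected to write the digest
    to stamp_file after a successful license check.
    """
    stamp = Path(stamp_file)
    current = requirements_digest(paths)
    previous = stamp.read_text().strip() if stamp.exists() else None
    return current != previous, current
```

A caller would run the license check only when the first return value is true and then store the digest, for example with Path(stamp_file).write_text(digest).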
I've added a pre-commit hook for this in 3233dbc.