Better Licensing
Opened this issue · 2 comments
bbengfort commented
We should probably explore our licensing options. One interesting website is OCLC, which mentions Open Data Commons and the legal tools they provide for licensing.
We should also explore Creative Commons, as well as proprietary or restricted licenses.
Along with this, we probably want to store the text of licenses as Markdown, and we may even want to automatically populate them with variables (e.g. using a templating engine).
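A minimal sketch of what populating a license from variables could look like, using Python's standard-library `string.Template` as a stand-in for a templating engine. The template text, the placeholder names (`year`, `organization`, `project`), and the `render_license` helper are all assumptions for illustration; nothing like this exists in the project yet.

```python
from string import Template

# Hypothetical Markdown license template; the placeholder names are illustrative.
LICENSE_TEMPLATE = Template(
    "# License for $project\n"
    "\n"
    "Copyright (c) $year $organization\n"
)


def render_license(template, **variables):
    """Fill in a license template; raises KeyError if a placeholder is missing."""
    return template.substitute(**variables)


if __name__ == "__main__":
    print(render_license(
        LICENSE_TEMPLATE,
        year=2016,
        organization="Example Org",
        project="Example Dataset",
    ))
```

Using `substitute` rather than `safe_substitute` means a forgotten variable fails loudly instead of leaving a stray `$year` in the published license text.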
lauralorenz commented
Some research directions I went:
- "Database" licenses provided by Open Data Commons (which, by the way, apply to any data, not just DBMS or DBMS-derived files)
  - a totally open (public domain) license, full name ODC Public Domain Dedication and Licence (PDDL), short name PDDL, full legal text for 1.0 here
  - an attribution type license, full name Open Data Commons Attribution License (ODC-By), short name ODC-By, full legal text for 1.0 here
  - a share alike with attribution type license, full name Open Data Commons Open Database License (ODbL), short name ODbL, full legal text for 1.0 here
- Copyright-extension licenses for any work provided by Creative Commons. Apparently there are some caveats in CC 3.0 for datasets that don't implicate copyright (see https://wiki.creativecommons.org/wiki/Data_and_CC_licenses and https://creativecommons.org/2011/02/01/cc-and-databases-huge-in-2011-what-you-can-do/)
  - a totally open license with attribution
  - a share alike license with attribution
  - a no derivative license with attribution
  - non-commercial versions of each of those
  - and total public domain (CC0), which is basically totally open with no attribution
- Other licenses I see a lot on GitHub, found by looking at the GitHub-supported licenses
  - didn't read too many details, but these are the likes of Apache, GPL, and MIT. Should give these another read to make sure they are not software-only.
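To make the options above easier to compare, here is a rough sketch of the list as a small data structure. The `LicenseOption` class, the `CATALOGUE` tuple, and the `matching` helper are hypothetical names, and the catalogue is not exhaustive (combined non-commercial variants are left out). The flags follow each license's own published summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseOption:
    """One candidate license and the conditions it places on reuse."""
    short_name: str
    attribution: bool = False
    share_alike: bool = False
    no_derivatives: bool = False
    non_commercial: bool = False


# Illustrative, non-exhaustive catalogue of the licenses discussed above.
CATALOGUE = (
    LicenseOption("PDDL"),
    LicenseOption("ODC-By", attribution=True),
    LicenseOption("ODbL", attribution=True, share_alike=True),
    LicenseOption("CC0"),
    LicenseOption("CC BY", attribution=True),
    LicenseOption("CC BY-SA", attribution=True, share_alike=True),
    LicenseOption("CC BY-ND", attribution=True, no_derivatives=True),
    LicenseOption("CC BY-NC", attribution=True, non_commercial=True),
)


def matching(catalogue, **conditions):
    """Return short names of licenses whose flags equal every given condition."""
    return [
        option.short_name
        for option in catalogue
        if all(getattr(option, flag) == wanted for flag, wanted in conditions.items())
    ]
```

For example, `matching(CATALOGUE, share_alike=True)` would pick out the share alike options from both families.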
From an implementation perspective I also did some research. Here are some buckets in terms of those:
- Store these ourselves as `License` objects, as we've provided a model for it, and deal with any parsing/hosting ourselves, possibly including converting them into some template format that we write logic to interpolate from
- Use some other project's API to host and interpolate variables for different licenses
  - Here's a project with template-friendly versions of many licenses, parameterizable by year, organization, and project, and which is itself a CLI tool for license template variable substitution: https://github.com/licenses/lice. It's written in Python, so we could creep on the API ourselves; there is also a JS port (https://github.com/licenses/lice-js) and a Lua port (https://github.com/licenses/lice-lua)
- Use some other project's API to host licenses
  - GitHub itself also has a license-chooser app, the codebase of which is available (https://github.com/github/choosealicense.com), including the txt versions of the licenses they support (in https://github.com/github/choosealicense.com/tree/gh-pages/_licenses) with some metadata encoded in a format they maintain (e.g. https://github.com/github/choosealicense.com/blob/gh-pages/_licenses/wtfpl.txt#L1-L20)
  - Speaking of, GitHub hosts a License REST API which, though ostensibly intended for project search, you can use to get all the licenses GitHub supports: https://developer.github.com/v3/licenses/, so we could use this as a farm/host for licenses if we like. In that same vein, Creative Commons has a REST API too, though it looks a little scary: https://api.creativecommons.org/docs/
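As a rough sketch of the "use GitHub as a license host" option, the snippet below pulls a single license from the GitHub Licenses API using only the standard library. The helper names (`license_url`, `parse_license`, `fetch_license`) and the `User-Agent` value are assumptions for illustration. It assumes the single-license response carries `name` and `body` fields, so it's worth checking against the API docs linked above before relying on it.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def license_url(key=None):
    """URL for the license list, or for one license when a key such as "mit" is given."""
    return f"{API_ROOT}/licenses" + (f"/{key}" if key else "")


def parse_license(raw):
    """Extract (name, body) from the raw JSON of a single-license response."""
    payload = json.loads(raw)
    return payload["name"], payload["body"]


def fetch_license(key):
    """Fetch one license over the network; not called on import."""
    request = urllib.request.Request(
        license_url(key),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "license-sketch",  # hypothetical client name
        },
    )
    with urllib.request.urlopen(request) as response:
        return parse_license(response.read())


if __name__ == "__main__":
    name, body = fetch_license("mit")
    print(name)
    print(body[:200])
```

Keeping `parse_license` separate from the network call means the parsing logic can be exercised with canned JSON, which would also make it easy to swap in self-hosted license files later.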
lauralorenz commented
Ok did some more reading to answer some of the questions I left above.
- Regarding using Creative Commons for data: turns out the caveats related to CC licenses for data/databases have been resolved in the latest CC version 4.0 (https://creativecommons.org/share-your-work/licensing-considerations/version4/)
- Popular licenses on GitHub include MIT, Apache 2.0, and GNU GPLv3. These are considered 'software licenses' because the legalese refers to 'source', but GitHub's choosealicense.com recommends them for non-software as well, particularly for works that can be edited and versioned as source. Otherwise, they recommend CC version 4.0 licenses for data or the data portions of a mixed-license project. They have a great visual appendix of the licenses they support, with a snapshot of what each one means, here: https://choosealicense.com/appendix/
- The US Federal government's best-practices project (Project Open Data) also recommends CC and Open Data Commons licenses for data, as well as the GNU Free Documentation License (which is basically an open license for "Documents", a category some data might fall under). (https://project-open-data.cio.gov/open-licenses/)