n8willis/opentype-shaping-documents

License

amitdo opened this issue · 5 comments

Please add a license to your docs.

It is on the to-do list, but as with the other "experimental nature" concerns, we left it off initially to discourage people from citing and spreading it before it took final form — since it has no official status.

Been revisiting the license-selection topic recently, in light of the overall state of the repo. There are several important questions to ask for choosing a license for something that is designed to be a functional specification useful for outside implementations.

Here are some of the general categorical principles:

Off-the-shelf software licenses

Not an appropriate fit, at least among the major licenses that I have poked at. The use-cases defined in most software licenses focus on making and publishing modifications. Ideally, we do not want the "standard" to be forked or to circulate in multiple derived-work forms.

Also, it may go without saying, but just in case: it's likely a disadvantage to apply a copyleft(ish) license that would impede proprietary projects from implementing the same behavior. The implicit goal, which I suppose we ought to make explicit, is maximal compatibility, which means that there should be nothing preventing a proprietary app from using the docs as a schema to write a shaping engine from, because users benefit more when every implementation is compatible.

In addition, I think a lot of the standard-issue terms-and-conditions in FOSS software licenses (such as detailing how compiled binaries are distributed and making offers of source) are sufficiently irrelevant to a set of documents that they would only cause confusion.

Off-the-shelf documentation licenses

I have also batted around the Creative Commons licenses and the GNU Free Documentation License.

The CC license suite's terms are, roughly:

  • Attribution
  • ShareAlike
  • NonCommercial
  • NoDerivatives

Attribution is important because it lets us require a reference back to the original copy (beyond that, it's not high on my list). NonCommercial, ShareAlike, and NoDerivatives all seem problematic to me, specifically because they would interfere with other people's ability to make translations or to quote from the documents in another work; most importantly, to copy and paste quoted sections into source code.

But Attribution-Only (CC-BY) seems vulnerable to the divergent-and-incompatible-forks problem. It would seem to be better to find a license that explicitly discusses the use of quotations in source files. For example, some of those quotations could be comments (to keep a description close to a function that implements it) but other quotations could be of functional bits like regular expressions.

The GNU FDL seems to be regarded (by third parties) as difficult to implement, for a few reasons. One is the option of "invariant sections" that the authors can declare. I would never want to invoke that option myself, but since it's defined as part of the license itself, it does seem like downstream translators or later editors could tack on their own invariant sections, which would be a problem.

It also seems like the requirement to log all changes would be difficult to comply with. As things stand, several script sections have not really been put to the full "independent second implementation" test yet, and how each implementer handles things keeps changing (e.g., just look at all the work that goes into HarfBuzz's support for Sinhala and SE Asian scripts). Tracking all the changes in the documents could become a burden unto itself.

It also isn't clear to me that the "quotation in comments" issue is adequately addressed in the GFDL.

Upstream

Another approach entirely might be to just adopt a license that is as close as possible to those of the "upstream" specifications, so that at least they'd be maximally compatible. I think the chief difficulty there would be that OpenType/OFF and Unicode have starkly different licenses.

Mimic somebody else who does this a lot already

Yet another approach would be to replicate the license(s) of an existing publisher of open-and-free software standards. The tricky bit here is that there are quite a few publishers (IETF, IEEE, W3C) and, by and large, their licenses are specific to the institution. That is to say: you cannot just start up a font project and say "this is released under the W3C Font Stuff License" ... not merely because the W3C hasn't published a vetted, meant-to-be-reusable license, but also because the terms-and-conditions of the W3C projects' licenses directly reference the W3C.

Upshot

All that having been said, if anyone has an argument to make for a particular license, addressing the issues above, please make it. There are certainly standards-like projects that have adopted BSD-family or LGPL licenses after considering their own options. In my scouring of the books, though, it seems like most of those have been projects that are 50%+ code, rather than virtually 100% non-code.

That's a good analysis. Perhaps it suggests going in the opposite direction: listing the licensing criteria that we might want to apply to the docs (some of which you have already mentioned here) and letting that point the way?

Yeah; makes sense. I would start with the following criteria:

  1. Permission to use, copy, analyze, and redistribute for any purpose is granted as long as $SOME_DISCLAIMER and $ORIGINAL_LINK are included verbatim.

    • This would be the generic permissive-grant statement, needing to make clear that FOSS and proprietary implementations are permitted.
    • This should also cover the question of whether somebody could reproduce the entire thing or include it in a combined volume / download / other-distribution with other things. Since the grand goal is to fit into the gaps between Unicode and OpenType specifications, the easiest approach would be to be sure that these docs have as-few-or-fewer conditions (and/or less cumbersome ones) than both of those.
  2. Modification is permitted as long as the $ORIGINAL_LINK is retained and the modifications are noted in the Authorship preamble.

    • This is the "slight exception to generic-permissive-grant", by requiring the original link in order to allow downstream recipients to track changes.
    • An alternative to the "track changes by pointing upstream" approach would be that the changed sections are noted in the intro. AFAICT, this is what Python's 2.0 license does. Programming language licenses are instructive here, since they also care about compliance and compatibility between implementations. Noting changed sections seems a lot more feasible than noting every change (à la GFDL).
    • We want people to be able to work on forks, potentially long term (as would surely be necessary to hammer out some problems or to add some complex scripts); for that, pointing upstream and listing changes comes automatically when hosted on a public site like GitHub, but could necessitate a preface in a printed/exported document.
  3. A specific clause noting that executable/interpretable code snippets are free to be reused without the rest of the license conditions.

    • This is essentially just the regular expressions; they don't have real value and they originated from me studying HarfBuzz source code but renaming & rearranging things for human-readability. So imposing any additional conditions on them would probably mean first revisiting them all to check that nobody would consider them "derivatives" of HarfBuzz. Can't imagine anyone would, but it wouldn't be worth the time to try and track down any regex-users and find their license terms either.
    • (technically this would also include build scripts in the not-yet-merged Sphinx / image-generating-script forks, when merged, but those are vastly less interesting or special than the regular expressions)
  4. A specific clause noting that incorporating quotations from the document in software does not constitute a derivative work.

    • As in the previous thread message, the main concern is for inline comments; we don't want those to trigger license-combination processes for software implementers.
    • This could be as simple as specifying that partial quotation of verbatim sections only needs to be noted in one place, say, at the directory level.
  5. A specific clause noting that a translation is considered "one single modification" and therefore only needs to be noted/cited once.

    • This is also a way to be simpler than the GFDL: "noting every change" or even "each changed section" for a translated version would not, perhaps, be a gargantuan undertaking, but it'd certainly be a burden, and translators do a hard and thankless task already, so we can at least make life slightly simpler for them here.
  6. [Possibly an anti-clause]: I think it ought to be made clear that the text in any form is covered the same way and to the same degree. That could perhaps be made clear in the wording of (1).

    • This side-steps the "source format" conditions of a lot of software-specific licenses. A big reason I stuck with Markdown (which I don't especially think fits how I would design markup) is that the plain-text files in the git repo are readable and have some minimal level of formatting and structure.
    • I say "anti-clause" because that source-offer is usually specified as its own clause in software licenses. So maybe it ought to be a clause here, but maybe it just ought to be something to keep in mind within the basic-rights clause.
    • (edit: Perl's Artistic License 2.0 explicitly states that you can distribute modifications in any form, source or compiled.)

I think those are the basic set; everyone is welcome to posit others, as well as to react to these....

Separate comment for this: the other top-level concern to think about is what the copyright-like / authorship line would need to look like, since that's the part that would be required to be reproduced in $SOME_DISCLAIMER.

I don't mean whose name goes in it; I'd be happy to just put literally everyone in an alphabetical list. But it is important to disclaim "intellectual property" rights, especially where the edges of the docs meet the outside specifications for reference.

(In fact, I need to go back and add links to the README to reference the various trademarked names of the outside specs referred to. WOFF and WOFF2 do this already, in a "references" appendix, so I think that's the approach.)