Formalize OpenTracing API compatibility practices
bhs opened this issue · 5 comments
For context, the OpenTracing maintainers are making some people a little miserable by pushing out changes without much compatibility-related prep or, in some cases, follow-through.
On the one hand, PRs like opentracing/opentracing-java#115 got plenty of review... that thread was getting so long that GitHub regularly returned 500 errors to me by the end.
That said, these larger changes also have implications both for (a) those who implement the OpenTracing APIs, and (b) those who call them... the OT-Java 0.30 release had some unceremonious discussion on Gitter but was pushed out (by me – I'll take responsibility for it!) without a meaningful CHANGELOG entry or a "migration guide" for either implementors or callers. (We did add something to the README, though.)
There's perhaps an argument about the symbolic importance of a "1.0" release in each language, etc; maybe, but I would argue that the OpenTracing project has enough inbound dependencies at this point that we need to be more careful.
I would like to use this issue as a place to debate what's appropriate. There are three aspects I personally care about, though of course everyone is encouraged to chime in with more:
- What sort of "prep work" must happen before a backwards-incompatible change to a published OT artifact goes out (e.g., prepping PRs in advance that update all OTSC-member Tracers on day one)
- What sort of CHANGELOG entry and what sort of migration guide we must publish along with the backwards-incompatible OT artifact
- How frequently OpenTracing is even allowed to make backwards-incompatible changes to published artifacts (i.e., per year)
If people have references to processes / practices that have struck a nice compromise for other API projects, it would be great to see them referenced in this thread, too.
cc: @opentracing/otsc @opentracing/otiab @adriancole @tedsuo
Yes, we have serious issues with the concept of backwards-compatibility, since re-linking thousands of applications in a reasonable amount of time (less than 2 years) is close to impossible. @jquinn47 and @SaintDubious might be able to comment on backwards-compatible changes. My view is simple: the only version where non-backwards-compatible changes can be made is OT 2.0. All 1.0 changes must be backwards compatible. This means that 1.0 is stable and in "maintenance", where (potentially ugly but backwards-compatible) workarounds/patches are applied for every new feature while a nice, clean 2.0 is under development. That's how I view it! :)
Thank you to the maintainers, and for the record you haven't made me miserable. Quite the opposite!
If the number of inbound dependencies on an unstable API is becoming problematic, my suggestion is to prioritize achieving stability and shipping 1.0.0. As per semver 2.0:
> How do I know when to release 1.0.0?
>
> If your software is being used in production, it should probably already be 1.0.0. If you have a stable API on which users have come to depend, you should be 1.0.0. If you’re worrying a lot about backwards compatibility, you should probably already be 1.0.0.
Would be good to establish the must-haves and a tentative schedule for a 1.0 release, and then as @lookfwd suggests, use semver to communicate breaking changes.
In my experience, maintaining a functional example and linking to the diff is an ideal way of communicating the migration path of an API.
Post 1.0 release, my expectation is that major releases would be made when absolutely necessary, rather than an arbitrary timeline of one year. One year is a great target though.
This, if solved well, will reduce the downsides of OT proliferation. In Java, OT types are interfaces, so even adding a method (before Java 8) breaks implementors, leading to a revlock. This manifests as instrumentation or implementors being stuck until everything is upgraded. This impacts users because any library pinned to OT may not get other, OT-unrelated changes until all libraries synchronize on a version. Agents start needing more complex code to handle a matrix of versions. This sort of problem may not be obvious to all, but it is the primary reason application libraries repackage internal <1.0 libraries. Repackaging defeats the purpose of OT!
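To make the interface problem concrete, here is a minimal sketch assuming Java 8 or later. The names `ExampleSpan` and `RecordingSpan` are hypothetical and are not the real OpenTracing types. A method added as a `default` method leaves an older implementor compiling and working; added as a plain abstract method, it would break that implementor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface for illustration; NOT the real io.opentracing API.
interface ExampleSpan {
    ExampleSpan setTag(String key, String value);

    // Added in a later release. As a default method (Java 8+), it does not
    // force existing implementors to change. Declared abstract instead, it
    // would stop RecordingSpan below from compiling.
    default ExampleSpan setTag(String key, boolean value) {
        return setTag(key, Boolean.toString(value));
    }
}

// An implementor written before the boolean overload existed.
final class RecordingSpan implements ExampleSpan {
    final List<String> tags = new ArrayList<>();

    @Override
    public ExampleSpan setTag(String key, String value) {
        tags.add(key + "=" + value);
        return this;
    }
}

class DefaultMethodDemo {
    public static void main(String[] args) {
        RecordingSpan span = new RecordingSpan();
        span.setTag("error", true); // resolves to the inherited default method
        System.out.println(span.tags); // prints [error=true]
    }
}
```

Before Java 8 there is no equivalent at the interface level, which is why any added method forces every implementor to release in lockstep.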
Basically, if widespread integration is desired, strict semver and also a policy/culture that feels the pain of change will help remove collateral damage and turn things from liability to benefit. For research on this point, you can look at interfaces that last a long time, for example standard APIs (and the JSRs that underpin them). You can also look at Guava, which OT seems to aim to match in how widely it is used. Guava is a great example of where one library choice can pin an app forever.
One way to handle this is to have compilation integration tests on snapshot versions. Well prior to making a release candidate or next minor, CI on each of the OT-maintained integrations would catch any failures, giving people more than a gut feel for compat. This is important also because folks unfamiliar with Java can look at those reports vs trusting a "this should work" type of comment on an issue.
Another thing is that you can use Animal Sniffer to check API signatures on OT-maintained integrations. Once v1.0 is out, don't immediately use a new method added in 1.1, as that will lead to the same revlock. In other words, there are two problems: knowing that a version will break compat, and avoiding things that break compat.
Finally, there should be formalized release notes and a broadcast system for users. For example, it doesn't matter if there are 500 comments on an issue that only the inner circle is paying attention to. If something is widely deployed, there needs to be a channel to a wide audience well in advance of an inner circle choosing to break an API.
Hope this helps!
One other thing on this topic is to provide a "user API" that makes it clear which changes would impact the code end users are likely to need. Such an API could limit the scope of damage from OT changes. For example, one that only allows tagging or overriding certain things. We did this in Brave to help isolate a single class that we can watch very carefully.
https://github.com/openzipkin/brave/blob/master/brave/src/main/java/brave/SpanCustomizer.java
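As a rough illustration of the "user API" idea, here is a sketch with hypothetical names (`UserSpan`, `InternalSpan`). It is not Brave's actual `SpanCustomizer` and not an OpenTracing type. Application code sees only a narrow tagging interface, while the richer tracer-side type can change without touching that code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Narrow, hypothetical "user API": the only surface application code sees.
interface UserSpan {
    UserSpan tag(String key, String value);
}

// Richer tracer-internal type (also hypothetical). It can gain or lose
// methods without affecting code written against UserSpan.
final class InternalSpan implements UserSpan {
    final Map<String, String> tags = new LinkedHashMap<>();
    long finishMicros = -1;

    @Override
    public UserSpan tag(String key, String value) {
        tags.put(key, value);
        return this;
    }

    // Lifecycle control is deliberately kept out of the user API.
    void finish(long micros) {
        finishMicros = micros;
    }
}

class UserApiDemo {
    // Application code receives only the narrow type.
    static void handleRequest(UserSpan span) {
        span.tag("customer.tier", "gold");
    }

    public static void main(String[] args) {
        InternalSpan span = new InternalSpan();
        handleRequest(span);
        span.finish(1_000L);
        System.out.println(span.tags); // prints {customer.tier=gold}
    }
}
```

With this split, only changes to `UserSpan` need the strictest compatibility review.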
I can't read this thread and not think about this talk: https://www.youtube.com/watch?v=heh4OeB9A-c . If you have an hour to spare, this is a great talk on API design, by Joshua Bloch.
> In Java, OT types are interfaces, so even adding a method (before Java 8) breaks implementors, leading to a revlock.
A common trick for this is to provide an API to the user, and a base implementation as SPI to implementors. This way, if you have to evolve the API, you have to add a new base implementation that is sane enough for current implementations to use, making your code always backwards compatible.
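A minimal sketch of this API-plus-SPI split, using hypothetical names (`TracerApi`, `BaseTracer`, `LegacyTracer`) rather than real OpenTracing types. Implementors extend the base class instead of implementing the interface directly, so a method added to the API gets a fallback in the base class and existing implementations keep compiling, even before Java 8.

```java
// Hypothetical names throughout; not the real OpenTracing API.

// User-facing API: application code depends only on this interface.
interface TracerApi {
    String startSpan(String operationName);

    // Added in a later release.
    String startSpan(String operationName, long startMicros);
}

// SPI: implementors extend this base class rather than implementing
// TracerApi directly.
abstract class BaseTracer implements TracerApi {
    // Fallback for the newer method, so that subclasses written against the
    // older API still compile. Here it simply ignores the explicit timestamp.
    @Override
    public String startSpan(String operationName, long startMicros) {
        return startSpan(operationName);
    }
}

// An implementation written before the two-argument overload existed.
final class LegacyTracer extends BaseTracer {
    @Override
    public String startSpan(String operationName) {
        return "legacy:" + operationName;
    }
}

class BaseImplementationDemo {
    public static void main(String[] args) {
        TracerApi tracer = new LegacyTracer();
        System.out.println(tracer.startSpan("get", 42L)); // prints legacy:get
    }
}
```

The trade-off is that implementors give up their single superclass slot, and the fallback has to be a behavior that is acceptable for every existing implementation.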
> One other thing on this topic is to provide a "user API" that makes it clear which changes would impact the code end users are likely to need.
One tactic used by JBoss projects (or used to be used, at least) is to make it clear by the package name what level of support the API has. For OpenTracing, it could look like this:
- io.opentracing.internal -- not meant to be consumed by anyone, only by OpenTracing itself
- io.opentracing.spi -- meant to be implemented by a service provider
- io.opentracing -- meant to be consumed by the general population and implementations