18F/api-standards

Removing version numbers

konklone opened this issue · 29 comments

I suggest that version numbers be removed as a standard.

For one, including them in the URL is a clunky way of handling versioning. GitHub's API handles them as HTTP Headers, which I think is a more sane way of handling it: it allows your version strings to be longer and descriptive than one would be comfortable with in a URL, and lets the URL stick to describing where you're at and what you're asking for.

But more than that, I think versioning an API at all is premature optimization for any API that isn't setting out to be like, a 10-year evolving flagship service. If an API is going through a major version change, it's easier in a lot of ways to just set up shop at a new domain, or new path. For a major (as in, backwards-incompatible) new version, clients are going to have to update their code/libraries anyway -- version numbers don't add any value on top of domain names, for all but the most advanced use cases.

Committing to handling a version number, in either the URL (as a path param or a query string param) or in HTTP headers, means you're committing to make each version of the API smart about processing a version number, in some blunt or sophisticated way. That's extra complexity, and usually it's not needed.

So this might actually translate to a Recommendation to not use version numbers, in the absence of a specific long-term plan that implies their utility.

...and also, this could be replaced instead with a higher level discussion of ensuring backwards compatibility and stability when versions do shift (whether numbers are used to manage it or not).

This makes sense to me, although I know @jgrevich has thoughts on versioning and APIs.

Pretty solid article on versioning: http://apiux.com/2013/05/14/api-versioning/

It is pro-versioning, but highlights the downsides and discusses an alternative.

Downsides:

  • "Depending on the size of the user base and the nature of apps using the API any version update can be extremely painful."
  • "Keeping several versions alive at the same time can complicate your code base quite a lot."
  • "Any version change, or for that matter, any major change, in your API will come with increased support costs and a risk to alienate developers."

Alternative:

The alternative to versioning is to not introduce breaking changes. If you need to introduce potentially breaking changes you instead introduce these as additions to the current API. Instead of changing the data a resource type returns you instead introduce a new type of resource and have both resource types live forever side by side. Depending on how your API is used and the rate of change this might be your only solution.
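
To make the "additions only" idea concrete, here is a minimal sketch of a check that a new response shape only adds fields to an old one. The function name and the sample payloads are hypothetical, not taken from the article, and a real compatibility check would need to look at much more than top-level keys and types.

```python
def is_additive_change(old_response, new_response):
    """True if every old field survives with the same type; new fields are allowed."""
    return all(
        key in new_response and type(new_response[key]) is type(value)
        for key, value in old_response.items()
    )


# Hypothetical payloads: the second one adds a field, which old clients can ignore.
old = {"id": 1, "name": "a"}
print(is_additive_change(old, {"id": 2, "name": "b", "email": "x"}))
print(is_additive_change(old, {"id": "2", "name": "b"}))
```

Adding `email` passes the check, while changing `id` from a number to a string does not, since existing clients may depend on the old type.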

GUI commented

I'm a proponent of API versioning, even if you have no immediate plans to release multiple versions of the API. If you establish your versioning up-front, I think it clarifies your API's contract with the consumer, and also makes the transition a little smoother if you ever need to release a completely new version.

I also don't think versioning has to be complicated. We use URL based versioning, so for the first release all this really means is ensuring the API lives under a /v1 namespace in the URL. Otherwise, you don't have to think about versioning. If we do have to release new versions, then it's just a separate app living under a /v2 namespace (if you want your versions to share a codebase, then this might be more complicated, but the coding side of things really seems like a separate issue, and at least in our experience, new versions are usually different enough to warrant separate code).
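
As a rough illustration of "a separate app living under a /v2 namespace", the sketch below mounts one handler per major version under its own URL prefix. Everything here (`v1_app`, `v2_app`, `MOUNTS`, `dispatch`) is a made-up stand-in for whatever web server or framework actually does the routing.

```python
# Hypothetical sketch: each major version is its own handler ("app"),
# mounted under its own URL prefix. Names are illustrative only.

def v1_app(path):
    return {"version": 1, "path": path}


def v2_app(path):
    return {"version": 2, "path": path}


MOUNTS = {"/v1": v1_app, "/v2": v2_app}


def dispatch(url_path):
    """Route a request path to the app mounted at its version prefix."""
    for prefix, app in MOUNTS.items():
        if url_path == prefix or url_path.startswith(prefix + "/"):
            # Strip the prefix so each app only sees its own paths.
            return app(url_path[len(prefix):] or "/")
    return None  # unknown version: a real server would answer 404


print(dispatch("/v1/posts"))
print(dispatch("/v2/posts"))
```

The v1 handler never needs to know that v2 exists, which is the sense in which the first release only has to live under /v1.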

In general, I like the White House's original versioning guidelines.

@konklone, you had said:

If an API is going through a major version change, it's easier in a lot of ways to just set up shop at a new domain, or new path. For a major (as in, backwards-incompatible) new version, clients are going to have to update their code/libraries anyway -- version numbers don't add any value on top of domain names

To me, this sounds like you're describing URL based versioning (since you're talking about a new API at a new domain or path). I just like establishing the version from the get-go, so you don't run into awkward transitions if you do ever introduce a new version.

Semi-related to that, I am a proponent of URL-based versioning. I know there are varying opinions on this, but here are a few reasons we've stuck with URL versioning:

  • It's more obvious for API consumers. Particularly for people that are new to APIs, I think URLs are more universal (we've had some new API users that haven't known how to pass HTTP headers in for other cases).
  • It eliminates potential caching issues at various layers (naive caching servers may not respect the differences between versions if they exist at the same URL and browsers have sometimes had issues).
  • It makes it easier to distinguish API version usage in web server access logs or error logs if you're debugging things (the URL is usually always present, headers might not be).
  • It seems to be what most public APIs use. Here's an interesting rundown of what others are doing. The blog post is a bit dated, so maybe things are shifting, but I'd say URL versioning is still what I generally see in the APIs I personally consume.

Our team's basic approach is:

  • Start everything under a /v1 URL namespace that's specific to that API.
  • Never introduce breaking changes on a given API version. We basically follow semantic versioning where the major version is the only thing present in the URL.
  • Try to avoid introducing new versions until absolutely necessary. New versions are a pain for everyone involved, so I won't argue there, but I would argue that a lack of versioning is eventually more painful. And this isn't to say we don't improve our current APIs--we just have to be careful to ensure any changes are done in a backwards compatible way so we don't upset our API consumers (but I think this still applies whether or not you've officially versioned things).
  • If a backwards incompatible change is necessary, then try to make it part of a broader upgrade to the API (so users will actually want to switch), and then build out the new API under a new /v2 URL namespace.
  • When releasing a new API, we try to keep the previous version(s) around for as long as feasibly possible so developers have time to transition. How long that is varies on a case-by-case basis. In situations where we really need to get developers using the new version this might be short (maybe 3-6 months), but in cases where we can more comfortably support two versions, then we're letting the previous versions live somewhat indefinitely.
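
One way to picture "the major version is the only thing present in the URL" from the list above is the small sketch below. The helper name `url_prefix` is hypothetical, and it assumes plain `MAJOR.MINOR.PATCH` version strings.

```python
def url_prefix(semver):
    """Map a semantic version string such as '2.3.1' to its URL namespace."""
    major = int(semver.split(".")[0])
    return f"/v{major}"


# Backwards-compatible releases stay in the same namespace;
# only a breaking (major) release moves to a new one.
for release in ["1.0.0", "1.4.2", "2.0.0"]:
    print(release, url_prefix(release))
```

Under this scheme 1.0.0 and 1.4.2 both live at /v1, and only 2.0.0 introduces /v2.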

We may be talking past each other a little bit. I'm completely in agreement with the pain of introducing new, backward-incompatible functionality, and to try to prevent that as long as possible, both in practice and in designing the original API.

You're correct that URL based versioning makes setting up a separate app (at a new path) easier than distinguishing based on HTTP headers. And, though GitHub and others use headers, I don't think I'd want to support them as a mechanism myself. So let's drop that as a recommendation in any way.

My major point was that putting a version number in a URL doesn't actually save the API developer or client any work, at any stage.

Whether a /v1 goes to /v2, or the domain name is entirely different (or starts with a sudden v2.domainname.com subdomain) -- all of that means the URL has to get updated by clients or their client libraries. And of course, a new version means new, backwards-incompatible functionality, meaning more client work to comply with the new API (be it in a library update, or through one's own code).

I went through 3 versions of an API at my last job. I actually put /v1 in the URL of each of the first two versions, but for the third one I dropped it altogether. That's because each time, by the time the next version came out, the functionality and branding had changed enough that it had a new name, and merited a new domain name. And if I had wanted to keep the name, a v2. subdomain would have done the trick just fine. But that never happened.

So it really feels to me like premature optimization. Handle versioning when you get to the v2, not with the v1. (Similar to my suggested guidelines for naming World Wars.)

So it really feels to me like premature optimization. Handle versioning when you get to the v2, not with the v1.

I agree with this completely. Don't design stuff before you need it.

I'm not sure we've reached enough consensus here to include a section on it in our released v1, so, barring objection, I'll just remove the section entirely (which right now just says "TBD") and we can tackle this for an update.

👍

GUI commented

Thanks for the clarifications @konklone. If there's not consensus, punting this topic until later certainly seems reasonable (but I also don't want to be a roadblock on this if I'm the only one that's being ornery).

Just to continue the conversation, I think we're largely on the same page, although I still might make a very minor argument in favor of versioning from the very beginning (but this might be related to types of APIs we focus on). I definitely hear where you're coming from, though, in terms not prematurely optimizing. I guess my main argument in favor of versioning from the start mostly boils down to communication: I think versioning from the beginning makes communicating about future versions a lot easier with your users if/when they get released.

I can probably explain a bit better by detailing my experiences with one of our own APIs that's been through several versions. There are probably plenty of extenuating circumstances that apply to this tale, so this doesn't mean it applies everywhere, but I think this serves as a good example of what's led us to a versioning first strategy.

So.. Meet PVWatts. Its API is currently at version 4. Its history:

A couple important things to note:

  • V1 and V2 didn't actually have version numbers present anywhere in their URL or documentation. I'm just calling them V1 and V2 here for convenience.
  • V1 and V2 ran concurrently for years as somewhat separate projects. There was no communication to previous users that V2 had been released, so users simply used whichever version of the API they stumbled upon.

When we went to release V4, we needed to take down V2 as soon as possible after we got users to switch over. However, the V1 SOAP service has to continue running indefinitely for several key users.

To get V2 users to switch over to V4, we e-mailed all the PVWatts users to tell them about the deprecation of the old API and the need to switch to the new one. Our communication probably could have been clearer, but without easy version numbers to refer to, we were left referring to things by URLs or SOAP vs REST nomenclature. This led to a fair amount of confusion for users during the migration to V4, since some users just didn't understand what this meant for them--some V1 users became concerned their API was being discontinued (it wasn't), while some V2 users didn't think they needed to migrate to the new version (they did). Furthermore, when we did try to clarify things by updating V2's documentation to actually refer to it as V2, this further confused people, since some of the SOAP users were sure they were using V2 of PVWatts (I'm still not entirely sure why, but I think the SOAP service may have undergone its own update years ago that none of us were aware of).

It took about 6 months to get all the V2 users switched over, so during that time, we dealt with plenty of confused API users and questions. It's certainly not guaranteed that it would have gone smoother if all these old APIs had been versioned, but based on the questions I dealt with over that time, it does seem like a lot of this would have been clearer for all our users if API versioning had been established at the beginning. And again, there are probably plenty of other things we could have done to improve this communication process (it was our first real push to get users off an old API), so I'm certainly not saying versioning is the only solution to these kinds of communication problems--it just seems like it might have helped.

We're also in the stages of planning V5 of the PVWatts API (yes, V5 already, unfortunately). Purely from a URLs as communications perspective, I'm glad V4 is already versioned in the URL, since I think that it will be easier to understand the relationship between /api/pvwatts/v5.json and /api/pvwatts/v4.json rather than to an ambiguous /api/pvwatts.json had we not versioned V4 on developer.nrel.gov. But I think this also demonstrates that this is the kind of simple API that's remained a single endpoint throughout each version. The API is just one URL endpoint, and there is no other branding beyond PVWatts. So we don't have a lot else that would change in new versions of the API beyond our version number (going forward the domain won't change, there's no branding to change, and there's no other piece of the URL to change since it's a single endpoint).

So anyway, that's my very winding tale of PVWatts. This is obviously a very specific tale, and our experiences there won't translate everywhere, but it has led our team to adopt a version-first strategy that seems to have helped with a couple other API migrations. However, I can also totally understand how these experiences wouldn't necessarily translate to other types of API projects. If everyone's experiences are different in this area and there's no consensus then maybe it doesn't make sense to make this part of the standards, but I thought I'd contribute my take on it. And now I'll finally shut up. :)

No that's super helpful, and after all, I'm going by a personal anecdote here myself. I totally can see why the confusion was difficult to mitigate over version changes, especially with no change in name or branding.

It does seem like the confusion would have been largely mitigated though, if the version number had appeared starting at v2. In this case, it didn't appear until v4, which is maybe, er, post-mature optimization? ;)

I know in my case (at Sunlight), our branding renames were a source of confusion. We had a history like this...

2007 - Sunlight Labs Congress API - (MoCs only)
2010 - Drumbone API - (everything but MoCs)
2011 - Real Time Congress API - (everything but MoCs, deprecated Drumbone)
2013 - Sunlight Congress API (MoCs and everything else, deprecated everything)

Version numbers just wouldn't have gotten us out of that PR hole (a hole I was basically responsible for digging Sunlight in and out of).

But I also wouldn't have wanted us to just stick with a previous name because of tradition. The eventual name and URL was good, even if the process was messy. I may also put a higher priority on aesthetics in URLs than some -- when I do workshops introducing APIs to newbies, it's nice to be able to have extraordinarily simple URLs, like congress.api.sunlightfoundation.com/legislators, that make intuitive sense to people who are also just learning URLs.

I'm very much +1 on versioning from the get go and I'm very surprised to see any argument against it. If we assume that an API will sooner or later have to change, the best way to do that is via versioning, and we should plan on that from the get go. It's not a premature optimization if you know you'll benefit from it later.

I'd much rather have URLs like:

  • /api/v1/post/
  • /api/v2/post/

Than to have subdomains:

  • domain/api/post/
  • v2.domain/api/post/

Or to not have versioning at all.

Eric, you're using the API I stood up, and it's about to get v2. Isn't it better that v1 can stay untouched and that v2 will be at a predictable place? By putting v1 in the URL, isn't it nice that I've provided you some indication of my intention to keep that stable?

I don't know...this is one of those times that I'm upset and feel like SOMEONE IS WRONG ON THE INTERNET, so I should probably just leave it at this, but if we're providing guidelines for APIs, I feel very strongly that there should be versions as a recommendation.

Eric, you're using the API I stood up, and it's about to get v2. Isn't it better that v1 can stay untouched and that v2 will be at a predictable place?

I don't see why. I have no need to predict what the URL or requirements will be for v2 when I integrate against v1. When v2 is released, I'm going to have to come back and revisit your documentation, see what's changed, update my client lib, etc. If that means api.courtlistener.com has become v2.api.courtlistener.com, that makes no difference to me in API usability compared to /v1/ becoming /v2/. The difference is that in the former scenario, your API v1 URL is cleaner.

By putting v1 in the URL, isn't it nice that I've provided you some indication of my intention to keep that stable?

I expect a promise of stability and backwards-compatibility no matter what you put in the URL. 😄

I see a lot of /v1/'s out there, and very few /v2/'s or /v3/'s. My argument isn't really against the idea of API versioning, but against bothering to plan for it ahead of time, because there's really no downside in waiting.

The difference is that in the former scenario, your API v1 URL is cleaner.

And that makes a better, more professional, API that you trust more, right? Something that we'd want to recommend?

My argument isn't really against the idea of API versioning,

That's a relief. I confess I misunderstood that at first.

but against bothering to plan for it ahead of time

But the planning is super easy: Just call it "v1". And the result is a better API with more consistent, more professional URLs. Anyway, if we agree that things should be versioned eventually, isn't that an argument for including recommendations for when it would eventually come up?

There's also an argument that making people think about future versions is a good thing when they're developing their API the first time.

The difference is that in the former scenario, your API v1 URL is cleaner.

And that makes a better, more professional, API that you trust more, right? Something that we'd want to recommend?

It doesn't appear more professional or trustworthy to me -- to me, it just adds character noise and makes it look more sophisticated.

But the planning is super easy: Just call it "v1". And the result is a better API with more consistent, more professional URLs.

It doesn't make it a better API, it keeps URLs consistent across versions -- which I just don't think matters very much. And every unneeded parameter in a URL makes it longer to type and tougher to understand. (I feel this way about API keys, too.)

Anyway, if we agree that things should be versioned eventually, isn't that an argument for including recommendations for when it would eventually come up?

Maybe, but headers vs subdomains vs URL path sort of stuff seems pretty application- and context-specific. I see it as more of a PR concern than an engineering concern.

This issue is being referenced in other tickets lately, so I'd like to say that I myself am +1 on including versioning in the URL from day 1. The perceived problems of a few extra characters in the endpoint url are outweighed by the communication, transparency, and future-proofing benefits of including the version number. I am surprised this is a controversial topic.

I was not previously familiar with the White House's versioning guidelines, but as @GUI said I like them.

I like that way of thinking about it, @hollyallen. Including versioning from day one indicates to an API user that it is something you are thinking about, and that they should keep it in mind as they design and maintain their app, too.

I agree that an API should be versioned, but I disagree with declaring an API to be at version 1 from day one because that goes against semantic versioning. If the API is still in development and is not ready for production use yet, then it should not be declared v1. Otherwise, if you are declaring that you are following semantic versioning, you will end up at v5 or more by the time you are ready to release the API due to the breaking changes you are likely to introduce while initially developing the API.

I also agree with @konklone that specifying the version via the Accept header is preferable to making it part of the URL because the URL, which should represent a resource, should not change. It's not the version of the resource that's changing. Just like we use HTTP verbs for actions on resources, using the Accept header is the analogous way for a client to request a particular representation of the content.
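
For anyone who hasn't seen header-based versioning, here is a minimal sketch of how a server might read the version out of the Accept header. The vendor media type `application/vnd.example.v<N>+json` and the function name are assumptions made up for this example, not any particular API's format.

```python
import re

# Assumed media type shape: application/vnd.example.v<N>+json
_VERSION_RE = re.compile(r"application/vnd\.example\.v(\d+)\+json")


def version_from_accept(accept_header, default=None):
    """Return the major version requested via the Accept header, if any."""
    match = _VERSION_RE.search(accept_header or "")
    return int(match.group(1)) if match else default


print(version_from_accept("application/vnd.example.v2+json"))
print(version_from_accept("application/json", default=1))
```

The URL stays the same for every version; the catch is that the server has to decide what to do when no version is given, which is what the `default` argument stands in for.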

Another versioning method I've seen in some APIs is to allow the client to specify the version via a request parameter, which also sounds reasonable to me. The Google Maps Javascript API is an example.

In his article on API versioning, Robbie Clutton from Pivotal Labs says:

Perhaps using the Accept header makes you a better denizen of the Internet, adhering to HATEAOS principles but I think using a version request parameter makes for a better Web experience. As a developer, I want to be able to put a URI into a browser and see a response rendered and for that I'd lean more towards the version parameter.

The bottom line is that there is no universal standard, and neither method is wrong. It boils down to preference. Here is how some popular APIs are versioned:

In the URL:

  • Facebook
  • Twitter
  • YouTube
  • AccuWeather
  • LinkedIn
  • Pinterest
  • Tumblr
  • Yelp

Via Accept Header:

No versioning:

  • Flickr
  • Slack
  • Code Climate

Via request parameter:

  • Google Maps

Inconsistent versioning:

  • Reddit (some URLs contain a version, most do not)

Here is another interesting post on API versioning:
http://urthen.github.io/2013/05/16/ways-to-version-your-api-part-2/

I agree that an API should be versioned, but I disagree with declaring an API to be at version 1 from day one because that goes against semantic versioning.

@monfresh What would you think of declaring v0 to start? Would that address your concerns?

@arowla Sounds reasonable to me.

GUI commented

@arowla beat me to it, but yeah, if I'm publicly releasing an API that's still under development, I version it as v0, which I think helps communicate the beta nature of it.

Regarding versioning in URL paths vs Accept headers vs query string parameters, as I think I've already rambled about before in this thread, I like URL paths. But any of those are certainly valid versioning options. In any case, I agree URL paths might not technically be the most correct way to version things, but I echo the sentiments of "as a developer, I want to be able to put a URI into a browser and see a response rendered."

In comparison to a query string param, I think I still prefer keeping it in the URL path, because it gives an obvious root URL to each version. This may not be as valuable for all types of APIs, but if there are multiple endpoints that make up your API, I think having a root prefix can be a more obvious way to group all of the endpoints that will be versioned together and should be used together (if it's a query string parameter, it seems like this can become a bit lost or easier to mis-match versions).

But as you said, @monfresh, it boils down to preferences, and there are obviously lots of preferences. :)

"as a developer, I want to be able to put a URI into a browser and see a response rendered."

+ 💯

I was thinking about this, as well. In the case of non-URL versions, this could be mitigated by the API making the most recent version the default, but that risks serving up the wrong version to unsuspecting users of the previous version who might not have specified the version in the header or query param. That seems like a less desirable situation than just putting it out there, plain and simple, in the URL.
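
A small sketch of that failure mode, assuming a hypothetical `?version=N` query parameter and a server that falls back to the latest version when it is missing:

```python
from urllib.parse import parse_qs


def resolve_version(query_string, latest):
    """Pick the version from a hypothetical ?version=N param, else the latest."""
    values = parse_qs(query_string).get("version")
    return int(values[0]) if values else latest


# A client that pins its version keeps getting v1 after v2 ships.
print(resolve_version("q=solar&version=1", latest=2))

# A client that never sent a version is silently moved from v1 to v2.
print(resolve_version("q=solar", latest=1))
print(resolve_version("q=solar", latest=2))
```

The same request string resolves to a different version once `latest` changes, with no change on the client side. With the version in the URL path, there is no unpinned request to begin with.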

@gbinal As always, our standards should recommend the best thing for the user. Have issues around API versioning ever come up in your API usability testing?

For completeness, I also agree with following semantic versioning standards and using v0 instead of v1 when the API is still in heavy churn. Sorry I wasn't more clear about that in my initial comment.

I'm curious, is anyone arguing against versioning from day 1 at this point? @konklone, your last comment was over a year ago.

Have issues around API versioning ever come up in your API usability testing?

Yes. Generally, version numbers are appreciated and encouraged, and developers are agnostic about the method (in the URL vs. in the header). FWIW, I haven't heard anyone complain about their being in the URL.

@konklone - I respect your earlier point about avoiding premature optimization but one thing to weigh is the natural distrust that developers have with gov't APIs. There's evidence and stories of agencies not maintaining them and developers have spoken of appreciating seeing versioning in a newly released API as a sign that the agency understands developer needs and will take them into account in the future.

I don't think there's any need for us to formulate a norm as to how to do versioning. I would be in favor of adding to our standards some form of versioning or some communication to the developers about how future updates will be communicated and handled, to put their minds at ease.

Interesting point about "developer needs". I think I'm liking the v0 starting point. Maybe v0 could always exist as the bleeding edge, from which v1, v2, ... vn will eventually emerge, if need be.

Doing this also seems perfectly compatible with the advice to avoid breaking changes in general. Could be v0 is all we'll ever need, or v0 and v1 could remain functionally identical if we need a v1 label to provide confidence.

I also kinda prefer URL schemes to headers, at least for the Team API use case.

Ok, I think we're converging on a standard, here, folks!

Versioning from day 1, 'v0', with a slight preference for URL over HTTP Header for usability. Unless there are objections, somebody want to start a PR?

I'm curious, is anyone arguing against versioning from day 1 at this point? @konklone, your last comment was over a year ago.

Well I still feel that way, but the consensus across the team is very clearly to do version numbering from day 1, so 👍 to a PR.