/WebID

A privacy preserving federated identity Web API


title: WebID
created: 01/01/2020
updated: 03/02/2021
redirect_from: index.html

Not to be confused with this WebID, whose authors have graciously allowed us to use the name as a codename until we find a better one.

TL;DR: This is an active exploration of how to react to the ongoing privacy-oriented changes in browsers (e.g. 1, 2 and 3) while preserving identity federation (e.g. OpenID, OAuth and SAML) on the web.

The Problem

Over the last decade, identity federation has unquestionably played a central role in raising the bar for authentication on the web, in terms of ease-of-use (e.g. passwordless single sign-on), security (e.g. improved resistance to phishing and credential stuffing attacks) and trustworthiness compared to its preceding pattern: per-site usernames and passwords.

The standards that define how identity federation works on the Web today (namely SAML, OpenID and OAuth) were built independently of the Web Platform, and their designers had to (rightfully so) work around its limitations rather than extend it.

Because of that, existing user authentication flows were designed on top of general-purpose web platform capabilities such as top-level navigations/redirects with parameters, window popups, iframes and cookies.

However, because these general-purpose primitives can be used for an open-ended number of use cases (again, notably, by design), browsers have to apply policies that capture the lowest common denominator of abuse, at best applying cumbersome permissions (e.g. popup blockers) and at worst blocking them entirely (e.g. blocking third party cookies).

Over the years, as these low level primitives were abused, browsers intervened and federation adjusted. For example, once popup blockers became common, federation had to adapt to a world where they were widely deployed.

The challenge, now more than ever, is that some of these low level primitives are getting increasingly abused to allow users on the web to be tracked. So, as a result, browsers are applying stricter and stricter policies around them.

Publicly announced browser positions on third party cookies:

  1. Safari: third party cookies are already blocked by default
  2. Firefox: third party cookies are already blocked by a blocklist, and
  3. Chrome: third party cookies are already blocked by default on iOS, and Chrome intends to offer alternatives that make them obsolete in the near term on other platforms.

Blocking third party cookies broke important parts of the protocols in those browsers (e.g. logouts) and made some user experiences inviable (e.g. social button and widget personalization).

While the current impact of blocking third party cookies is easier to see, it is equally important to understand the ways in which the other low level primitives that identity federation depends on (e.g. redirects) are being abused, and the principles browsers are using to control them, so that we don't corner ourselves into another dead end.

If browsers are applying stricter policies around the low level primitives that federation depends on, and under the assumption that federation is significantly better than usernames/passwords, how do we keep identity federation around?

Third Party Cookies

The problem starts with what we have been calling the classification problem.

When federation was first designed, it was rightfully designed around the existing capabilities of the web, rather than changing them. Specifically, federation worked with callbacks on top of cookies, redirects, iframes or popup windows, which didn't require any redesign, redeployment or negotiation with browser vendors.

One example of low level primitives that federation depends on is iframes combined with third party cookies. For example, credentialed iframes are used for logging out and for social button and widget personalization.

Unfortunately, that's virtually indistinguishable from trackers that can track your browsing history across relying parties, just by having users visit links (e.g. loading credentialed iframes on page load).

We call this the classification problem because it is hard for a browser to programmatically distinguish between these two different cases: identity federation helping a user versus users being tracked without any control.

Third party cookies are already blocked in Safari and Firefox by default (and Chrome intends to block them soon too), which makes these use cases inviable.

The problems then are:

  1. First and foremost, what Web Platform features need to be exposed to (re-)enable these features of federation so they can co-exist with the absence of third party cookies in browsers going forward?
  2. Secondarily, in which direction are browsers going that could potentially impact federation?

Navigational Tracking

Before we prematurely jump into solutions for the first (and more urgent) problem, we think there is something more fundamental changing. Let's take a step back and a closer look at the second problem: in which direction are browsers going that could more fundamentally impact federation?

While third party cookies in iframes are used in federation, a more fundamental low level primitive that federation relies on is top level navigation (e.g. redirects or form POSTs), which takes the user to the identity provider (with a callback, e.g. redirect_uri) and back to the relying party with a result (e.g. an id_token).
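The round trip described above can be sketched as three small functions. This is a minimal illustration, assuming an OpenID Connect style request whose result comes back in the URL fragment; the endpoints `idp.example` and `rp.example`, the client id and the function names are hypothetical and not taken from any specific deployment.

```typescript
// Hypothetical endpoints, used only for illustration.
const IDP_AUTHORIZE_ENDPOINT = "https://idp.example/authorize";
const RP_CALLBACK = "https://rp.example/callback";

// Step 1: the relying party navigates the user to the identity provider.
// The callback address travels as a URL parameter (redirect_uri).
function buildAuthorizationUrl(clientId: string, state: string, nonce: string): string {
  const url = new URL(IDP_AUTHORIZE_ENDPOINT);
  url.searchParams.set("response_type", "id_token");
  url.searchParams.set("client_id", clientId);
  url.searchParams.set("redirect_uri", RP_CALLBACK);
  url.searchParams.set("scope", "openid");
  url.searchParams.set("state", state);
  url.searchParams.set("nonce", nonce);
  return url.toString();
}

// Step 2: the identity provider navigates the user back to redirect_uri,
// attaching the result (here an id_token) to the URL fragment.
function buildCallbackUrl(authorizationUrl: string, idToken: string): string {
  const request = new URL(authorizationUrl);
  const redirectUri = request.searchParams.get("redirect_uri");
  if (redirectUri === null) throw new Error("missing redirect_uri");
  const callback = new URL(redirectUri);
  const result = new URLSearchParams({
    id_token: idToken,
    state: request.searchParams.get("state") ?? "",
  });
  callback.hash = result.toString();
  return callback.toString();
}

// Step 3: the relying party reads the result from the URL it was navigated to.
function readCallback(callbackUrl: string, expectedState: string): string {
  const params = new URLSearchParams(new URL(callbackUrl).hash.slice(1));
  if (params.get("state") !== expectedState) throw new Error("state mismatch");
  const idToken = params.get("id_token");
  if (idToken === null) throw new Error("missing id_token");
  return idToken;
}
```

Note that everything the two parties exchange is carried in URLs during top level navigations, which is exactly the channel that the rest of this section is concerned with.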

However, unfortunately, this low level primitive also enables cross-site communication, namely via link decoration, which can be abused to track users without their control in what's called bounce tracking.

In this formulation of bounce tracking, websites redirect the user to cross-origin websites that automatically and invisibly redirect the user back to the caller, but passing enough information in URL parameters that allows the tracker to join that visit (e.g. when you visit rings.com) with visits in other websites (e.g. when you visit shoes.com).
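To make the mechanism concrete, here is a toy in-memory model of a bounce tracker. It is only a sketch: the class name, the `tid` parameter and the `uid-N` identifiers are invented for illustration, and the `cookie` argument stands in for the tracker's own first-party cookie, which it can read because the bounce is a top level navigation.

```typescript
// Toy model of a bounce tracker. All names are hypothetical.
class BounceTracker {
  private nextId = 1;
  private visits = new Map<string, string[]>();

  // Called when a site redirects the user through the tracker.
  // `cookie` is the tracker's existing first-party cookie, if any.
  bounce(returnTo: string, cookie: string | undefined): { cookie: string; redirect: string } {
    // Reuse the identifier from the cookie, or mint one on first contact.
    const id = cookie ?? `uid-${this.nextId++}`;

    // Record which site the user came from, joined on the identifier.
    const origin = new URL(returnTo).origin;
    this.visits.set(id, [...(this.visits.get(id) ?? []), origin]);

    // Decorate the return link so the calling site learns the identifier too.
    const redirect = new URL(returnTo);
    redirect.searchParams.set("tid", id);
    return { cookie: id, redirect: redirect.toString() };
  }

  // The cross-site browsing history accumulated for one identifier.
  profile(id: string): string[] {
    return this.visits.get(id) ?? [];
  }
}

const tracker = new BounceTracker();
const first = tracker.bounce("https://rings.com/", undefined);
const second = tracker.bounce("https://shoes.com/", first.cookie);
// second.redirect is "https://shoes.com/?tid=uid-1": both sites now see the same identifier.
```

From the browser's point of view, each bounce is just a pair of top level navigations with URL parameters, which is structurally the same shape as a federated sign-in round trip.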

In federation, that's less invisible/automatic, but it is still there. Cross-site tracking is enabled via federation when relying parties that the user signs in to collude with each other (and other entities) to deterministically (or probabilistically) link their users' accounts to build and get access to a richer user profile (e.g. one site selling data on browsing history for ads targeting to another service). While this could be enabled without federation per se (a user could manually provide a joinable email address or phone number), federated identity providers have an opportunity to address this problem at scale by providing their users with site-specific/directed identifiers.
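One possible way for an identity provider to produce directed identifiers is to derive them from a secret key, the user's internal account id and the relying party's origin. The sketch below assumes a keyed hash (HMAC-SHA-256 from Node's `crypto` module); the function and parameter names are hypothetical, and nothing in this document prescribes this particular derivation.

```typescript
import { createHmac } from "node:crypto";

// Derives an identifier that is stable for one (account, relying party) pair
// but differs across relying parties. `secretKey` is assumed to be known
// only to the identity provider.
function directedIdentifier(
  secretKey: string,
  accountId: string,
  relyingPartyOrigin: string,
): string {
  return createHmac("sha256", secretKey)
    .update(`${relyingPartyOrigin}\n${accountId}`)
    .digest("hex");
}

// The same account yields unrelated-looking identifiers at different sites.
const atRings = directedIdentifier("idp-secret", "account-42", "https://rings.com");
const atShoes = directedIdentifier("idp-secret", "account-42", "https://shoes.com");
```

Because the two relying parties receive different identifiers, they cannot join their records on the identifier alone. They could still collude using other data the user shares, such as an email address.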

Because of these tracking risks, browsers are starting to disable third party cookies in iframes and more generally provide tighter control over cross-site communication (e.g. a privacy model for the web).

Because this cross-site communication takes place in a general-purpose medium, it is hard for browsers to distinguish between cross-site communication that exchanges identity data deliberately (e.g. federation) and cross-site communication that does so without the user's knowledge (e.g. tracking).

Browsers can't classify federation, hence the name.

The classification problem is notably hard because it has to deal with adversarial impersonation: agents who have an interest in being classified as federation to get access to browser affordances.

While browser restrictions on link decoration are much further out in time, they threaten federation much more fundamentally.

Publicly announced positions by browsers on bounce tracking:

So, how do we distinguish federation from tracking and elevate the level of control while assuming adversarial impersonation?

Proposal

Clearly, this is a massive, multi-agent, multi-year problem across the board.

There aren't any easy solutions and most of the answers come in the form of alternatives with trade-offs.

There are billions of users that depend on federation on the web, millions/thousands of relying parties and thousands/hundreds of identity providers. There are also tens of browsers and operating systems, all moving independently. None of that changes overnight and we don't expect it to.

The specs that define how federation works (e.g. OpenID, SAML, etc.) are intricate and long (e.g. session management, authorization, etc.).

Having said that, failing to be proactive about effecting change and making federation forward compatible with a more private Web can steer users to less secure patterns, like usernames/passwords or native apps.

The approach we have taken so far has been a combination of two strategies:

  • a firm and principled understanding of where we want to get to
  • a well informed, deliberate and pragmatic choice of which steps will take us there

We believe a convincing path needs to have a clearly defined end state but also a plausible sequencing strategy.

Sequencing

While much of the environment is changing and evolving as we speak, there are concrete flows that are inviable right now and enough signals about the principles and challenges ahead of us.

Much of this is evolving quickly and we are adapting as we learn, but here is our best representation of how we expect features to be developed:

Stage | Timeline | Description
Stage 0 | 2020 | Understanding of the problem and properties of the end state
Stage 1 | 2021 | dev trials in Q1/Q2 (instructions) and origin trials in Q3/Q4 of alternatives to third party cookies
Stage 2 | 2021+ | origin trials of alternatives to top level navigation
Stage 3 | 2021++ | other related problems and opportunities

Stage 1: Third Party Cookies

The more urgent problem that clearly has already affected federation (or is about to) is the blocking of third party cookies. We plan to tackle this first:

  • Why, What and When? Today, third party cookies are blocked on Safari and Firefox. They are in the process of becoming obsolete in Chrome in the foreseeable future.
  • So What? Logging out and social button and widget personalization break. (anything else? add your use case here)
  • Ok ... Now What? Here are some early proposals on how to preserve these use cases.
  • Who and Where? Browser vendors, identity providers, relying parties and standards bodies are involved. The discussions so far have happened at the WICG and at the OpenID Foundation.

Stage 2: Bounce Tracking

Bounce tracking comes next. The situation is still evolving, but it has much more profound implications for federation:

  • Why, What and When? Safari's periodic storage purging and SameSite=Strict jail, Firefox's periodic storage purging and Chrome's stated privacy model for the Web.
  • So What? Purging or partitioning storage across redirects / posts forces users to re-authenticate at each transition of federation flows, at best defeating the convenience that federation provides and at worst making it less secure. (anything else? add your use case here)
  • Ok ... Now What? Here are some early proposals on how to preserve these use cases.
  • Who and Where? Browser vendors, identity providers, relying parties and standards bodies are involved. The discussions so far have happened at the WICG and at the OpenID Foundation.

Stage 3: Future Work

There is a series of related problems affecting federation that we believe we have a unique opportunity to tackle as a consequence of the choices that we make in stages 1 and 2.

These are key and important problems, but a lot less urgent, so we are being very deliberate about when and how much to focus on them.

How can I help?

At the moment, we are actively working with the browser and identity ecosystems to determine product requirements (contribute here with the list of use cases), ergonomics and deployment strategies that minimize change and maximize control, for example by having people test our APIs (instructions) and give us feedback.

Much of this explainer is evolving as a result of this field experimentation. The most constructive/objective way you can help is to:

  1. get a good understanding of the why: understand the ongoing privacy-oriented changes in browsers (example) and their principles
  2. help us understand what: contribute here with a use case that you believe can be impacted
  3. help us understand how: help us discover options (for cookies and navigations) and evaluate their trade-offs. Try the APIs under development and help us understand what works / doesn't work.

Deep Dives

The following should give you a deeper understanding of the problem, related problems and how they were tackled in the past:

With that in mind, let's take a closer look at what high-level APIs could look like for each of these two passes: