privacycg/storage-partitioning

Clear-Site-Data for partitioned storage can be used for cross-site tracking

johnwilander opened this issue · 20 comments

Back when WebKit considered whether or not to implement Clear-Site-Data, we noted that clearing partitioned data upon receiving that header can be used for cross-site tracking purposes. Since not many others were considering partitioned storage at the time, we never filed issues about it, at least not that I'm aware of.

The attack is about one first party site having control over website data under another first party site.

Imagine site.example registering these 33 domains: haveSetPartitionedData.example and bucket1.example through bucket32.example.

site.example runs script in the first party context on a great many websites. As part of its execution on those sites, it injects 33 invisible iframes for the domains mentioned above.

Let's say site.example is executing its script on news.example. If a cross-site user ID has not yet been planted for news.example, the haveSetPartitionedData.example iframe will not have website data yet and communicates to the bucket1.example through bucket32.example iframes to start fresh. The bucket1.example through bucket32.example iframes all store '1' in their partitioned storage and report back to the haveSetPartitionedData.example iframe when they are done. Now the haveSetPartitionedData.example iframe stores the fact that 32 '1's have been stored in the news.example partition.

Every time the user visits site.example, site.example gets to see its unpartitioned cookies, which identify the user. Let's say it uses a 32-bit ID for the user. It now makes sure to send Clear-Site-Data response headers matching the '0's in the unpartitioned cookie ID for the corresponding bucket domains. For example, let's say the user ID has '0's in bits 4, 6, and 20. Then site.example would make sure website data is cleared for bucket4.example, bucket6.example, and bucket20.example.
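The mapping from ID bits to bucket domains can be sketched as follows. This is a minimal illustration, not code from the issue: the function name `buckets_to_clear` is hypothetical, and it assumes bit N of the ID (counting from 1 at the least significant end) corresponds to bucketN.example.

```python
ID_BITS = 32  # width of the user ID, as in the description above


def buckets_to_clear(user_id: int) -> list[str]:
    """Return the bucket domains whose data must be cleared to encode user_id.

    Assumption: bit N of the ID (1 = least significant) maps to bucketN.example.
    A '0' bit is encoded by clearing that bucket's website data.
    """
    if not 0 <= user_id < 2 ** ID_BITS:
        raise ValueError("user_id must fit in 32 bits")
    return [
        f"bucket{n}.example"
        for n in range(1, ID_BITS + 1)
        if not (user_id >> (n - 1)) & 1
    ]


# An ID whose only zero bits are bits 4, 6, and 20.
example_id = 0xFFFFFFFF & ~((1 << 3) | (1 << 5) | (1 << 19))
print(buckets_to_clear(example_id))
```

With the example ID above, the list contains exactly bucket4.example, bucket6.example, and bucket20.example.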

Now when the user visits news.example, the haveSetPartitionedData.example iframe will have website data set and communicates to the bucket1.example through bucket32.example iframes to report their '1's and '0's (no website data means '0') to the site.example script on news.example.

Voilà, cross-site user ID established.
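The read-back step on news.example can be sketched like this. It is an illustration under the same assumed bit numbering (bit N maps to bucketN.example, with bit 1 least significant); `decode_user_id` and the `has_data` mapping are hypothetical names.

```python
ID_BITS = 32


def decode_user_id(has_data: dict[int, bool]) -> int:
    """Rebuild the ID from what each bucket iframe reports.

    has_data maps a bucket number (1..32) to whether that iframe still found
    website data in the current partition. No data means '0'.
    """
    user_id = 0
    for n in range(1, ID_BITS + 1):
        if has_data.get(n, False):
            user_id |= 1 << (n - 1)
    return user_id


# Buckets 4, 6, and 20 were cleared; all others still hold their '1'.
reports = {n: n not in (4, 6, 20) for n in range(1, ID_BITS + 1)}
print(hex(decode_user_id(reports)))
```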

Only accepting Clear-Site-Data from the current first party website would mitigate this attack but not fix it. Further, if this attack is combined with browser/device fingerprinting, it only needs to add enough cross-site bits to reach ≈32 bits in total.

A couple questions.

  1. How does the server know which client to send the clear-site-data header to for each bucket frame domain?
  2. How is this different than the server communicating state to each bucket frame through cookies?

It seems to me if the server can know to send a clear-site-data header for an iframe request it could know to send a cookie header.

Edit: Or know to respond to an XHR with equivalent state.

A couple questions.

  1. How does the server know which client to send the clear-site-data header to for each bucket frame domain?

When the user is on site.example as first party website, it makes three requests:

  • bucket4.example/?command=respondWithClearSiteData
  • bucket6.example/?command=respondWithClearSiteData
  • bucket20.example/?command=respondWithClearSiteData

… to which those servers respond with a Clear-Site-Data header to set zeroes for bits 4, 6, and 20 in all partitions at once.
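On the server side, the bucket domains only need to map that request to a response header. A rough sketch, assuming the `command` query parameter from the URLs above and assuming the attacker chooses the `"cookies"` and `"storage"` directives (the issue does not say which directives would be sent); `response_headers` is a hypothetical helper:

```python
from urllib.parse import parse_qs, urlsplit


def response_headers(url: str) -> dict[str, str]:
    """Return the extra response headers a bucket server would send for url."""
    query = parse_qs(urlsplit(url).query)
    if query.get("command") == ["respondWithClearSiteData"]:
        # Directive values are quoted strings in the header syntax.
        return {"Clear-Site-Data": '"cookies", "storage"'}
    return {}


print(response_headers("https://bucket4.example/?command=respondWithClearSiteData"))
```

As noted later in the thread, the URL does not need to carry any per-user information; a fixed path would work equally well.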

  2. How is this different than the server communicating state to each bucket frame through cookies?

I'm assuming that cookies and website data are partitioned. That's the premise.

It seems to me if the server can know to send a clear-site-data header for an iframe request it could know to send a cookie header.

Edit: Or know to respond to an XHR with equivalent state.

When the user is on site.example, site.example can only affect website data for itself and third parties in its partition. However, if Clear-Site-Data clears data in other partitions, site.example can affect data cross-site, which is why this can be turned into a cross-site tracking vector.

bucket4.example/?command=respondWithClearSiteData

So this is link decoration then; albeit with only one bit of entropy.

I'm assuming that cookies and website data are partitioned. That's the premise.

Yes, but the server could still respond with a cookie when it sees your link decoration. The cookie would be stored in the partition cookie jar. Then the cookie state could be queried the same way you propose above (I assume with postMessage). It doesn't seem like clear-site-data is needed at all in this case?

However, if Clear-Site-Data clears data in other partitions, site.example can affect data cross-site, which is why this can be turned into a cross-site tracking vector.

I'm sorry, but I don't understand. Above you had the iframes using link decoration to get the header added in their own partitioned context. I don't see where clear-site-data across partitions is coming in?

I do agree clear-site-data affecting across partitions would be an information leak, but is that spec'd or implemented anywhere?

@mkruisselbrink explained to me that the link decoration is on XHR subresource requests. The issue makes more sense to me now. Sorry for my confusion.

It does seem clear-site-data should not cross partition boundaries.

FWIW, I am told chrome does not honor clear-site-data on 3rd party subresource requests today. It seems the spec does support it, though.

bucket4.example/?command=respondWithClearSiteData

So this is link decoration then; albeit with only one bit of entropy.

No, it has nothing to do with link decoration. The URL on bucket4.example can be anything, and can be a fixed value. The point is that you establish 32 1-bit values which can be read in third-party context from all partitions.

I'm assuming that cookies and website data are partitioned. That's the premise.

Yes, but the server could still respond with a cookie when it sees your link decoration. The cookie would be stored in the partition cookie jar. Then the cookie state could be queried the same way you propose above (I assume with postMessage). It doesn't seem like clear-site-data is needed at all in this case?

I don't think your understanding of the attack matches what John is outlining. It's not link decoration. John happened to use a URL with a '?' in it, but that doesn't have to be the case.

However, if Clear-Site-Data clears data in other partitions, site.example can affect data cross-site, which is why this can be turned into a cross-site tracking vector.

I'm sorry, but I don't understand. Above you had the iframes using link decoration to get the header added in their own partitioned context. I don't see where clear-site-data across partitions is coming in?

That's not what is happening. Let me explain a slightly simpler version in more detail. Imagine each bucketN.example supports three URLs:

bucket1.example/read --> returns an observably different result depending on whether a cookie is set (in the current partition); this doesn't need an iframe, it can be an image that's selectively either 1x1 or 2x2.
bucket1.example/set --> responds with a Set-Cookie header, thus setting the cookie in the current partition only (since this is in a context of partitioning).
bucket1.example/clear-all --> responds with a Clear-Site-Data header, thus clearing its cookie in all partitions.
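The three operations can be modelled with a toy class. This is only a sketch of the semantics described above: `BucketModel` is a hypothetical name, the choice that a 2x2 image means "cookie set" is an assumption, and blog.example is a made-up second embedding site.

```python
class BucketModel:
    """Toy model of one bucketN.example server plus the browser's cookie jars."""

    def __init__(self) -> None:
        # Top-level sites under which this bucket currently has a cookie.
        self.partitions_with_cookie: set[str] = set()

    def read(self, top_level_site: str) -> str:
        # /read: observably different result depending on the cookie.
        # Assumption: 2x2 means the cookie is set, 1x1 means it is not.
        return "2x2" if top_level_site in self.partitions_with_cookie else "1x1"

    def set_cookie(self, top_level_site: str) -> None:
        # /set: affects the current partition only.
        self.partitions_with_cookie.add(top_level_site)

    def clear_all(self) -> None:
        # /clear-all: Clear-Site-Data that (problematically) hits every partition.
        self.partitions_with_cookie.clear()


bucket = BucketModel()
bucket.set_cookie("news.example")
bucket.set_cookie("blog.example")
bucket.clear_all()  # triggered while the user is on a different top-level site
print(bucket.read("news.example"), bucket.read("blog.example"))
```

The asymmetry is the point: setting is scoped to one partition, while clearing reaches all of them.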

Now, imagine social.example wants to abuse servers supporting these operations to link user ID across sites without the user's consent. Let's say social.example is a very popular first party visit, and is also embedded in an iframe on many sites.

User visits news.example/article, which embeds social.example/widget in an iframe. The iframe checks for a Social-User-ID cookie, which would read from social.example's partition under news.example. If it's set, then it already has the user ID, and user identity has been linked cross site. Game over. So let's say it's not. Then it loads resources bucket1.example/read through bucket32.example/read. Are they all 0 or all 1? If not, then use that as the user ID, and save in the Social-User-ID cookie. If they were all 0, user ID is not yet set in this partition, so load bucket1.example/set through bucket32.example/set. Now the bits are all 1.

Later, the user visits social.example directly, where they are logged in. social.example retrieves a 32-bit user ID from a cookie. For all bits N in that user ID that are 0, it loads bucketN.example/clear-all. Because that operation clears in all partitions, it's now made the bits as read from the news.example partition reflect the bits of the user ID.

On the next visit to news.example, let's say news.example/video, there's another social.example/widget embedded in an iframe. It follows the same process as before. Now it sees a user ID that's not all 0 (never visited this site before) or all 1 (haven't yet been back to social.example as first party). So it assembles the bits and saves the user ID in the Social-User-ID cookie in the news.example partition. User identity has now been linked across sites, without the need for any collusion beyond an iframe embed.
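The whole sequence can be simulated end to end. This is a simplified sketch, not a description of any browser: the `Browser` class keys cookies by (top-level site, embedded site), `clear_all_partitions` stands in for a Clear-Site-Data header that ignores partition boundaries, the bit numbering (bit N maps to bucketN.example, bit 1 least significant) is an assumption, and the ID `0xDEADBEEF` is an arbitrary example value.

```python
from typing import Optional

ID_BITS = 32


class Browser:
    """Toy model: cookies keyed by (top-level site, embedded site)."""

    def __init__(self) -> None:
        self.cookies: dict[tuple[str, str], str] = {}

    def read(self, top: str, site: str) -> Optional[str]:
        return self.cookies.get((top, site))

    def set_cookie(self, top: str, site: str, value: str) -> None:
        self.cookies[(top, site)] = value

    def clear_all_partitions(self, site: str) -> None:
        # The problematic behaviour: clearing ignores the top-level site.
        for key in [k for k in self.cookies if k[1] == site]:
            del self.cookies[key]


def bucket(n: int) -> str:
    return f"bucket{n}.example"


def widget(browser: Browser, top: str) -> Optional[int]:
    """social.example/widget embedded under `top`; returns a linked ID if any."""
    known = browser.read(top, "social.example")
    if known is not None:
        return int(known)
    bits = [browser.read(top, bucket(n)) is not None for n in range(1, ID_BITS + 1)]
    if not any(bits):
        # Never seen in this partition: set every bit to 1.
        for n in range(1, ID_BITS + 1):
            browser.set_cookie(top, bucket(n), "1")
        return None
    if all(bits):
        return None  # user has not been back to social.example as first party
    user_id = sum(1 << (n - 1) for n, bit in enumerate(bits, start=1) if bit)
    browser.set_cookie(top, "social.example", str(user_id))
    return user_id


def first_party_visit(browser: Browser, user_id: int) -> None:
    """social.example as first party: clear the buckets for every 0 bit."""
    for n in range(1, ID_BITS + 1):
        if not (user_id >> (n - 1)) & 1:
            browser.clear_all_partitions(bucket(n))


browser = Browser()
first = widget(browser, "news.example")    # sets all bits, no ID yet
first_party_visit(browser, 0xDEADBEEF)     # broadcasts the zero bits
second = widget(browser, "news.example")   # reads the ID back
print(first, hex(second))
```

If `clear_all_partitions` were replaced with a clear limited to the current top-level site, the second `widget` call would still see all 1s and no ID would be linked.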

In summary, a Clear-Site-Data header that affects all partitions allows state to be broadcast into all partitions with some setup, and thus enables passive cross-site tracking. There is no link decoration! There was never a direct link from news.example to social.example in my example above. All the loads use fully generic URLs that do not contain a user ID.

I do agree clear-site-data affecting across partitions would be an information leak, but is that spec'd or implemented anywhere?

Having read this thread, I'm missing an explanation for:

Only accepting Clear-Site-Data from the current first party website would mitigate this attack but not fix it.

The attacker would have to navigate the user to, or open popups for, on average 16 bucket domains to set the zeroes in those partitions: 16 because on average half of the bits in a 32-bit user ID are zero.
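The figure of 16 can be checked with a short calculation, assuming user IDs are uniformly distributed over all 32-bit values (an assumption; the thread does not say how IDs are assigned):

```python
from math import comb

ID_BITS = 32

# Expected number of zero bits: sum over k of k * (number of IDs with k zeroes),
# divided by the total number of IDs.
expected_zero_bits = sum(k * comb(ID_BITS, k) for k in range(ID_BITS + 1)) / 2 ** ID_BITS
print(expected_zero_bits)
```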

For tracking at scale, this would have to be done continuously, for instance once a day or week to set the zeroes in the partitions of any new websites the user has visited.

Is that assuming that it would also clear the partitioned data of that origin?

Right, that is the issue. If Clear-Site-Data clears for all partitions, or can do so, it opens up for this attack.

Therefore, Clear-Site-Data must operate either

  1. within a single storage shelf or
  2. across the shelves keyed by the same top-level site (with varying second-level keys).

Right? I suspect (2) is roughly right, so that having foo.com send Clear-Site-Data will also clear all the partitioned storage for iframes nested inside it.
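The two options, and the global behaviour they are meant to replace, can be compared on a toy set of storage keys. This is a sketch under a simplifying assumption that a storage key is just a (top-level site, origin) pair; the function names and the example sites other than foo.com are hypothetical.

```python
StorageKey = tuple[str, str]  # (top-level site, origin), simplified


def cleared_single_shelf(keys: set[StorageKey], top: str, origin: str) -> set[StorageKey]:
    """Option 1: only the shelf for this origin under this top-level site."""
    return {k for k in keys if k == (top, origin)}


def cleared_same_top_level(keys: set[StorageKey], top: str) -> set[StorageKey]:
    """Option 2: every shelf keyed by the same top-level site."""
    return {k for k in keys if k[0] == top}


def cleared_globally(keys: set[StorageKey], origin: str) -> set[StorageKey]:
    """The vulnerable behaviour: this origin's data in every partition."""
    return {k for k in keys if k[1] == origin}


keys = {
    ("foo.com", "https://foo.com"),
    ("foo.com", "https://widget.example"),
    ("news.example", "https://widget.example"),
    ("widget.example", "https://widget.example"),
}
print(cleared_single_shelf(keys, "foo.com", "https://widget.example"))
print(cleared_same_top_level(keys, "foo.com"))
print(cleared_globally(keys, "https://widget.example"))
```

Only the last function touches data under news.example from a different top-level context, which is the cross-partition effect the attack depends on.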

Is there any disagreement about the desired state for this, or is it just that specs need to be updated to use the terminology about keying that this Work Item hasn't yet added to the Storage spec?

I'm not aware of disagreement on making a change, it just needs to be specified (with appropriate tests, ideally). Storage spec does not yet provide the right infrastructure for this, but the Clear Site Data spec does not currently have a dependency on Storage.

It seems right to me that either (1) or (2) from #11 (comment) would avoid this vulnerability.

Note also: there's a proposal to add an API that does something similar to the Storage Living Standard. Care must be taken to avoid the vulnerability in that case as well.

The Storage Standard will take over part of the definition of Clear-Site-Data. The plan to deal with partitioning there is through the storage key. If we go with 2 above that might require some awkward lookups though as you'd have to go through all the keys, but nothing prose can't handle. (And I think we want partitioned and non-partitioned data to be next to each other without some kind of hierarchical relationship between them to avoid subtle leaks, even though sometimes it might make sense to perform hierarchical operations on them.)

whatwg/storage#88 discusses how Clear-Site-Data might end up working. My idea was also that if that replaces with an empty box, we could use the same setup for migrating to non-partitioned data by replacing with non-partitioned data.

@bakulf and I discussed how Clear-Site-Data could work in a partitioned world.

I'm not comfortable allowing site A to remove storage of site B, even if site B is partitioned under A. This is a side channel that A can exploit to interfere with code running in B.

As such, I argue that 1 in #11 (comment) is the best answer for "storage" and I believe an equivalent answer for "cookie" can be found (we really need a more formalized cookie standard). 1 would still allow B to clear storage if it so desires (if it's partitioned under A, only those partitioned-under-A bits would be cleared).

Unfortunately, it is much less clear what a good answer for "cache" is and my current position is that we remove that, as we "did" with "executionContexts" before (see w3c/webappsec-clear-site-data#59). A possibility might be that we origin-match on cached URLs within the scope of the top-level site, which is how it works today but adjusted for a partitioned world, but that is a rather inelegant operation. (Of the various types to clear, "cache" does not have a strong motivation outlined in the document either.)

@jkarlin mentioned that "cache" was problematic for Chrome too and would get back with details.

Given the side channels, there was agreement on restricting clearing of "storage" and "cookie" to a storage shelf and whatever will be equivalent for cookies.

A concern that was raised is that this will not permit a site (really an origin) to clear all the bits related to it in the user agent without assistance from the user (and even then it would depend on available UI). I.e., a site (really an origin) can only clear its non-partitioned storage/cookies or a single top-level-bound instance of partitioned storage/cookies through this header.

@jkarlin any update here?

It's a slow operation in Chrome today. We don't keep the URL in the in-memory index (just a hash of the url), and have to open each entry to get the full URL and decide if we want to delete the file or not (if it matches the site to delete). We could improve things by adding the site (or partition key) to the index, at the expense of more memory to optimize for this one feature which I'd rather not do unless it saw a lot of usage.
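Why a hash-only index makes per-site clearing slow can be illustrated with a toy cache. This is not Chrome's implementation; `ToyCache` is a hypothetical model in which the in-memory index holds only URL hashes, the full URL lives with the "on disk" entry, and matching is done on the hostname as a simplification of "site".

```python
import hashlib
from urllib.parse import urlsplit


class ToyCache:
    def __init__(self) -> None:
        self.index: set[str] = set()      # in memory: URL hashes only
        self.disk: dict[str, str] = {}    # "on disk": hash -> full URL
        self.entries_opened = 0           # counts simulated disk reads

    @staticmethod
    def _key(url: str) -> str:
        return hashlib.sha256(url.encode()).hexdigest()

    def add(self, url: str) -> None:
        key = self._key(url)
        self.index.add(key)
        self.disk[key] = url

    def clear_site(self, host: str) -> int:
        """Delete entries for host; every entry must be opened to learn its URL."""
        removed = 0
        for key in list(self.index):
            self.entries_opened += 1
            if urlsplit(self.disk[key]).hostname == host:
                self.index.discard(key)
                del self.disk[key]
                removed += 1
        return removed


cache = ToyCache()
for url in ("https://a.example/x.png", "https://a.example/y.js", "https://b.example/z.css"):
    cache.add(url)
print(cache.clear_site("a.example"), cache.entries_opened)
```

Storing the site or partition key alongside each hash would let `clear_site` skip non-matching entries, at the memory cost described above.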

I agree on clearing at the shelf (partition) level as opposed to globally due to the side-channel.

I can't speak to whether clearing the cache should be included in clear-site-data. I don't know how many sites are using it today and if they're counting on that cache clearing.

Thanks, it's highly likely Firefox will remove "cache" given the problems it poses and the lack of advocacy for its support. That Safari doesn't yet implement Clear-Site-Data also helps. For "storage" and "cookie" we plan to function as I outlined above.

Over the past 28 days, cache was present in ~32% of the Clear-Site-Data calls on Android Chrome's stable channel (according to Navigation.ClearSiteData.Parameters, which unfortunately is neither public, nor tied to page views in any meaningful way). It might be worth digging into usage a little more before punting it (I haven't looked at any of this in a while, but I recall cached data being a big part of what Photos wanted to use this for).

Josh is quite right that it's slow, however: Navigation.ClearSiteData.Duration shows a suspiciously-similar ~32% of clearing operations taking more than a second. Unfortunately that's where the histogram tops out, so I can imagine it's slow indeed.

As an update, Firefox has unshipped "cache" support (see https://bugzilla.mozilla.org/show_bug.cgi?id=1671182) and thus far there have been no reports of breakage.