Fenced Frames
jkarlin opened this issue · 10 comments
Web security has traditionally focused on preventing unauthorized leakage of information across origins. We've seen browsers go so far as to put different sites into different processes to prevent that leakage.
But what about preventing leakage of any information? Even authorized? What if you want to embed a cross-origin document in your page and you don't want it to be able to talk to any other frame on your page, or even to its own origin on other pages?
Why might you want to do that? Well, let's say that you wanted to give sensitive information about your user to a cross-origin frame to display something useful, but you don't trust that site with the data (e.g., it might want to leak it to storage, or another context on the browser). For instance, the user’s finances could be fed to a retirement or mortgage calculator, without the third party being able to store that data or relay it to anyone else on the page. Or, perhaps the browser itself has sensitive information that it could provide to such a restricted frame (e.g., cross-site data like isLoggedIn, the chosen ad of a lift experiment, an ad from FLEDGE, or possibly even cross-site storage access).
From a web security perspective, we'd now have to include collusion in the threat model. Entities within the fenced frame should not be allowed to communicate with those outside.
The Fenced Frames explainer represents our initial stab in this direction and we’d like to build on it together with the Privacy CG. The current draft is primarily focused on the use cases of embedding ads where we feel that it’s acceptable to allow some data to leak via navigation, but only on click. We envision future modes as well where the fenced frame is allowed access to any data, but is never allowed to communicate or only through approved channels.
As a rough outline, the basic features of a fenced frame are the following:
- It has a separate frame tree from its embedder.
- The frames within the fenced frame tree can communicate with each other as normal, and appear to have their normal origins.
- The frames within the fenced frame cannot communicate with any frames outside of the fenced frame. e.g., no external broadcastChannel, storage, shared caches, postMessage, etc. This also means that we need to limit the possible sizes of the frame, ignore frame resizing, and carefully consider information that could be passed from permissions policies. Some bits will leak in this way, but we will constrain them as much as possible.
- The fenced frame is not allowed network access (it leverages navigable web bundles instead). We're considering an exception for origins that have declared that they will only provide caching services and will not log user activity.
- The fenced frame can navigate on click. This allows for a leak of data and must be scrutinized further. The leak is at least limited to user clicks. The URL of the navigation may be required to be specified at fenced frame creation.
- The fenced frame can be given an opaque url (e.g., urn:uuid) that the embedder can't interpret but maps to a full URL within the fenced frame. This is how we intend to allow APIs to choose content for a fenced frame, which is returned to the embedder in the form of a urn:uuid, and is passed on to the fenced frame for display.
Thanks for taking a look!
Fenced frames as described in the explainer would defeat the protections browsers have created against third-party tracking. How? It’s a combination of two proposed features:
- They have a src parameter that allows an arbitrary URL to be specified. That arbitrary URL can communicate arbitrary information into the fenced frame. Note that this is not limited to query parameters. A server set up for this purpose could use an encoding of information in the path or hostname.
- The fenced frame has access to third party storage. It can store that passed in info and link it cross site.
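To make the first point concrete, here is a minimal sketch of why stripping query parameters alone would not stop information from flowing in through src. The host tracker.example, the identifier u123, and all function names are hypothetical and chosen only for illustration; strip_query stands in for a link decoration restriction that drops the query string and fragment.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Hypothetical mitigation: drop the query string and fragment from a URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def encode_in_path(user_id: str) -> str:
    """A cooperating server can carry the identifier as a path segment."""
    return f"https://tracker.example/frame/{user_id}/index.html"


def encode_in_hostname(user_id: str) -> str:
    """A cooperating server can carry the identifier as a subdomain label."""
    return f"https://{user_id}.tracker.example/frame.html"


def recover_from_path(url: str) -> str:
    # Path looks like /frame/<id>/index.html, so the id is the third piece.
    return urlsplit(url).path.split("/")[2]


def recover_from_hostname(url: str) -> str:
    # Hostname looks like <id>.tracker.example, so the id is the first label.
    return urlsplit(url).hostname.split(".")[0]
```

In this sketch the query-based identifier is removed by strip_query, but the path and hostname encodings pass through it unchanged and can still be read back by the server.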
A fenced frame can also be created by a third-party script without fully informed consent of the site.
The privacy considerations section alludes to these problems but does not propose a realistic solution. It suggests unspecified link decoration restrictions. But, while it can be presumed normal sites will not be compatible with hiding info in the path or hostname from an outside link decorator, we can’t assume that of these fenced frames.
I’d object to Privacy CG adopting fenced frames as a work item in its current form, as it largely defeats protections against third-party stateful tracking, and enables violation of the Chrome Privacy Sandbox principle that disallows cross-site linkage of user identity. It may be that these problems can be adequately mitigated away, but I’m skeptical, and the explainer in its current form does not plausibly address this very serious privacy hole.
I guess I’ll add that fenced frames without third party storage access would potentially be a tool for sites to improve their privacy. But as proposed, the privacy risk outweighs the potential privacy reward.
There seems to be some confusion regarding the src and mode of fenced frames that we are proposing here. The fenced frame mode that we are working on currently for use cases like interest based advertising and conversion lift measurement disables cross-site information joining because of the following:
- the src of the frame is an opaque urn:uuid, the embedding page cannot inspect it to get any information regarding the interest group or experiment group of the user. Since this urn:uuid needs to match what the browser has internally stored, the publisher cannot modify it by appending any user identifying bits. This is described in these sections in the explainer repo: 1, 2 and in the linked design doc.
- Additionally, in this mode the fenced frame does not have access to unpartitioned storage. Examples of user data that it does have access to, and that need to be protected, are the interest group and the experiment group of the user in the FLEDGE and conversion lift measurement use cases, respectively.
I agree with the privacy concerns if there were an arbitrary url and unpartitioned storage access, but that's not the proposed design. If that were allowed, significant additional mitigations would definitely be needed, and those would need to be discussed and solved separately from the mode described above.
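A toy model of the opaque src idea may help here. It is only a sketch under assumptions: OpaqueUrlRegistry, register, and resolve are hypothetical names, not a browser API, and the example URL is made up. The point it illustrates is that the embedder only ever sees the urn:uuid, and that a urn:uuid with anything appended no longer matches what the browser has stored.

```python
import uuid
from typing import Dict, Optional


class OpaqueUrlRegistry:
    """Toy stand-in for the browser-internal urn:uuid to URL mapping."""

    def __init__(self) -> None:
        self._urls: Dict[str, str] = {}

    def register(self, real_url: str) -> str:
        # The random URN carries no information about the URL it maps to.
        urn = uuid.uuid4().urn  # e.g. "urn:uuid:<random hex groups>"
        self._urls[urn] = real_url
        return urn

    def resolve(self, urn: str) -> Optional[str]:
        # Only an exact match resolves; a modified URN maps to nothing.
        return self._urls.get(urn)
```

In this model the embedder would pass the returned value to the fenced frame unchanged; resolve on a value such as the URN plus "?uid=u123" returns None.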
the src of the frame is an opaque urn:uuid, the embedding page cannot inspect it to get any information regarding the interest group or experiment group of the user. Since this urn:uuid needs to match what the browser has internally stored, the publisher cannot modify it by appending any user identifying bits.
The example in the spec is inconsistent with the claim that the src of the frame is (always) an opaque urn:uuid.
Additionally, in this mode the fenced frame does not have access to unpartitioned storage. Examples of user data that it does have access to, and that need to be protected, are the interest group and the experiment group of the user in the FLEDGE and conversion lift measurement use cases, respectively.
What's this section of the Explainer about then? It's titled "Unpartitioned Storage Access" https://github.com/shivanigithub/fenced-frame#unpartitioned-storage-access
Per your explanation it sounds like maybe fencedframe is something that only the UA could create and set the src for, and that it would be an opaque URL that's not controllable by the embedding page. Is that the intent? If so, then:
- Fenced frames would not be useful outside of TURTLEDOVE and similar proposals, and would not have independent privacy value
- The explainer doesn't make it at all clear that this is the intent.
On the other hand, if fencedframe is intended both for use by the UA and use by websites, then the threat I presented still applies, even if one intended use of fenced frames would use opaque URLs.
It would be helpful to think of fenced frames having separate modes, for instance:
- The mode discussed in my earlier comment where the src is a browser mapped urn:uuid and no unpartitioned storage/cookie access is given except for the use-case specific data e.g. interest group/ experiment group etc.
- Another mode could be one where src is any url but the fenced frame does not have write access to storage or any network access to prevent any exfiltration of cross-site joined data.
- Another mode is where src is any url but one that is guaranteed to not be user identifying. In that case the fenced frame could get access to unpartitioned storage. The section on unpartitioned storage refers to mode (3) and the challenges mentioned there do not yet have a proposed solution.
All of these modes are similar in the way the fenced frame tree and the rest of the page cannot access each other or communicate via postMessage etc. but they differ in the type of protections they need in order to maintain the privacy goal of not joining and exfiltrating cross-site data.
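One way to read the three modes is that each removes a different ingredient needed to join and exfiltrate cross-site data. The sketch below encodes that reading; it is an interpretation, not something the explainer specifies. The field names are hypothetical, the assumption that mode 2 can read unpartitioned storage is inferred from "cross-site joined data", and can_exfiltrate is set to True for modes 1 and 3 only as a conservative placeholder, since the comment does not say. Navigation on click and side channels are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameConfig:
    name: str
    src_can_identify_user: bool        # can the embedder put user data in src?
    reads_unpartitioned_storage: bool  # can the frame see its cross-site identity?
    can_exfiltrate: bool               # storage writes or network access available?


def can_join_and_exfiltrate(cfg: FrameConfig) -> bool:
    """All three ingredients are needed to link identities and get the link out."""
    return (cfg.src_can_identify_user
            and cfg.reads_unpartitioned_storage
            and cfg.can_exfiltrate)


# The three modes as read from the list above (assumptions noted in the lead-in).
MODE_1 = FrameConfig("opaque src, no unpartitioned storage", False, False, True)
MODE_2 = FrameConfig("any src, no storage writes or network", True, True, False)
MODE_3 = FrameConfig("non-identifying src, unpartitioned storage", False, True, True)

# The combination the earlier objection was about: nothing removed.
UNRESTRICTED = FrameConfig("any src, unpartitioned storage, network", True, True, True)
```

Under these assumptions the predicate is False for each of the three modes and True only for the unrestricted combination, which is why each mode needs a different kind of protection to hold.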
Does the explainer document these disjoint modes? Maybe I read too fast but it wasn’t clear to me that there were separate modes with distinct behavior. Even the OP of this issue does not make clear that there are three disjoint modes; it seems to ascribe properties of only some of the modes to all fenced frames. But taking this to be the case:
- Mode 1 seems like it’s only relevant to Turtledove and so probably doesn’t belong in this group, since Turtledove is not a work item of this group and hasn’t even been proposed as one, so we’re not in a position to design a Turtledove-specific feature.
- I have concerns about mode 2 because it’s not clear how exfiltration would be prevented, in particular with a cooperating third-party script in the embedding page. In any case, this mode does not seem like a privacy improvement. It seems in many ways less privacy-preserving than regular third-party iframe, which has no third-party storage access that would have to be protected against joining in the first place. Additionally, there doesn’t seem to be an obvious way to load third-party content without network access except the suggested Web Bundles, a feature that does not have broad standards-track consensus at this time. (Even with Web Bundles, care would have to be taken that the bundle URL is itself not user identifying.)
- I am extremely skeptical about mode 3 because it’s not clear if there is a sensible definition of “any url but one that is guaranteed to not be user identifying”. Without solutions to this and other identified unsolved problems, this mode seems to be a pure negative for privacy as it would allow joining of identity and activity across sites.
- It’s disappointing that none of these modes is strictly less powerful than a third-party frame, i.e. all the protections against accessing info from the containing page, but no access to unpartitioned storage.
Overall, there’s one mode specific to Turtledove, and two that seem likely to violate the stated privacy rules of most browsers.
Additionally, none of these modes seem appropriate for the use case stated in OP of a mortgage calculator given sensitive user data (but expected not to exfiltrate it). Mode 1 doesn’t work, because a browser-created opaque URN would not be able to communicate the user’s financial info. It can’t be mode 3 because the user’s sensitive financial info (presumably communicated in the URL) would appear to be user-identifying. The closest mode is 2, but such a calculator would not have any need for access to its unpartitioned storage; providing access to it seems to create unnecessary risk.
It would be helpful to revise the explainer and the OP of this issue to explain the three modes and the use cases for each. As it is, it’s difficult to review whether this work is a good fit for Privacy CG since the information provided does not reflect the multi-mode approach in #25 (comment)
Thanks for the feedback Maciej. While we've mostly focused on the ads use cases, we feel that there is potential for other use cases such that fenced frames might be a generally useful privacy primitive. If not, and folks here aren't interested in supporting ads use cases, we can look elsewhere.
Mode 1 seems like it’s only relevant to Turtledove and so probably doesn’t belong in this group, since Turtledove is not a work item of this group and hasn’t even been proposed as one, so we’re not in a position to design a Turtledove-specific feature.
Mode 1 also has the documented use case of lift studies via Shared Storage. We're interested in proposing Shared Storage to Privacy CG as well.
It’s disappointing that none of these modes is strictly less powerful than a third-party frame, i.e. all the protections against accessing info from the containing page, but no access to unpartitioned storage.
Note that Shivani provided a non-exhaustive list of possibilities and suggestions. We haven't fully explored this space. This is the time to have such discussions about what would be most useful. In my mind, we can provide for this by allowing fenced frames to be created with a normal URL or a urn:uuid provided by some other API (e.g., fledge/shared storage). I'd not intended to prevent URLs from being used and will clarify in the explainer.
I have concerns about mode 2 because it’s not clear how exfiltration would be prevented, in particular with a cooperating third-party script in the embedding page. In any case, this mode does not seem like a privacy improvement.
Given that the purpose of the fenced frame is to prevent communication between the embedder and the embedded, this is important to dig into. Exfiltration would be directly prevented by removing standard communication primitives. Spectre-style attacks could be prevented with process isolation. Other side channels and leaks need to be identified and considered.
It seems in many ways less privacy-preserving than regular third-party iframe, which has no third-party storage access that would have to be protected against joining in the first place
Creating a Fenced Frame with a normal URL would have strictly more privacy than an iframe. Providing an opaque URL (urn:uuid) which contains a small amount of cross-site data allows for leakage of the contents of that data, but only on user gesture. Again, this is strictly more private than providing said data to a regular 3p-iframe.
One could imagine allowing requestStorageAccess (prompted or otherwise) within a fenced frame. By default, it would be harder for the 1p and third party to join their user identities. The URL of the frame would have to have the user's 1p identifier on it. Which, I admit, seems tricky to prevent.
Additionally, there doesn’t seem to be an obvious way to load third-party content without network access except the suggested Web Bundles, a feature that does not have broad standards-track consensus at this time. (Even with Web Bundles, care would have to be taken that the bundle URL is itself not user identifying.)
We're considering web bundles and we're also considering a notion of trusted networks. These would be networks that have agreed to a policy and to auditing to verify that user-identifying information is not retained.
I am extremely skeptical about mode 3 because it’s not clear if there is a sensible definition of “any url but one that is guaranteed to not be user identifying”. Without solutions to this and other identified unsolved problems, this mode seems to be a pure negative for privacy as it would allow joining of identity and activity across sites.
Right, this is hard. Brainstorming and proposals in this space are welcome and we'd like to explore it further.
Additionally, none of these modes seem appropriate for the use case stated in OP of a mortgage calculator given sensitive user data (but expected not to exfiltrate it). Mode 1 doesn’t work, because a browser-created opaque URN would not be able to communicate the user’s financial info.
Mode 1 would work with a normal URL created by the embedder, which is what I had in mind.
The idea is that the request headers for the fenced frame document indicate both 1) that the document will be rendered in a fenced frame and 2) the mode it will be rendered in. A server can also refuse to have a document embedded as a fenced frame by not including the opt-in header. This is important for pages that don't want to be embedded in other pages (login and payment providers, for example), so that they cannot be framed through a fenced frame.
The request header could be the following.
sec-fetch-dest: fenced-frame-mode
And the response could carry the header:
Supports-Loading-Mode: fenced-frame-mode
Where mode could be one of the following: opaque-ads, read-only-cookies, etc.
- opaque-ads covers cases like TURTLEDOVE and lift studies via shared storage where the publisher does not know the src of the document in the fenced frame. This could alternatively be called opaque-src to also include any non-ads use cases. The downside would be that ads-specific defaults could not be applied, e.g. ads fenced frames will be disallowed certain features that are not expected from a good ad experience, such as modals, presentation, orientation lock, and pointer lock. For more details on this mode, please take a look at the design doc here. All feedback is welcome.
- read-only-cookies could cover use cases like one-tap sign in where, once read-only cookie access is given to the FF, it is not allowed any network access and can only invoke browser APIs like WebID. This is discussed more here.
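The opt-in handshake described above could be sketched roughly as follows. This is a sketch under assumptions rather than specified behavior: it reads "fenced-frame-mode" as the literal prefix "fenced-frame-" followed by the mode name, it represents headers as plain dicts with the exact key spellings shown, and the function names are hypothetical.

```python
from typing import Dict, Optional, Set

PREFIX = "fenced-frame-"  # assumed literal prefix; the mode name follows it


def requested_mode(request_headers: Dict[str, str]) -> Optional[str]:
    """Return the fenced frame mode named in sec-fetch-dest, if any."""
    dest = request_headers.get("sec-fetch-dest", "")
    if dest.startswith(PREFIX):
        return dest[len(PREFIX):]
    return None


def opt_in_headers(request_headers: Dict[str, str],
                   supported_modes: Set[str]) -> Dict[str, str]:
    """Server side: opt in only for modes this document is willing to load in."""
    mode = requested_mode(request_headers)
    if mode is not None and mode in supported_modes:
        return {"Supports-Loading-Mode": PREFIX + mode}
    return {}  # no opt-in header, so the document refuses fenced frame embedding


def fenced_frame_load_allowed(mode: str, response_headers: Dict[str, str]) -> bool:
    """Browser side: load only if the response opts in to the requested mode."""
    return response_headers.get("Supports-Loading-Mode") == PREFIX + mode
```

For example, a login page would pass an empty set of supported modes, so opt_in_headers returns no header and fenced_frame_load_allowed is False for every mode.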
Hey all, I just came across this very interesting proposal. A year or two ago I started working on redact, which is basically an implementation of fenced frames that allows websites to act as template providers, with the actual private content (e.g. name, submitted content, pictures, etc) filled in by the user's device via these frames.
The way we've implemented it is by running a small server on the user device that listens on localhost. The only URL possible for the frame is localhost to this endpoint, with a path equal to the data requested (e.g. https://localhost:8080/.firstName.). The protections we've implemented prevent client JS from successfully making AJAX requests to the same endpoint. We then combined this with a federated storage system and end-to-end encryption to make data storable anywhere. This all uses the existing iframe implementation; we haven't implemented anything new in the browser.
We have docs here and a "redacted" website up once you have the client running locally:
https://docs.redact.ws/en/latest/overview.html
And code is here:
https://github.com/pauwels-labs/redact-client
https://github.com/pauwels-labs/redact-store
https://github.com/pauwels-labs/redact-crypto
This is only one implementation of many, but I think it acts as an interesting starting point. Would love to be a part of whatever discussions or implementations may happen in this space.