bokand/web-annotations

Explicitly list the creator/consumer groups intended

Opened this issue · 2 comments

This is a super-interesting space to play in, but exactly who is publishing what to where drastically changes many aspects of it, from the content model to privacy considerations to abuse potential. We should explicitly figure out what groups we're addressing, and how. I've given this a little thought, and initially identified five possible broad groups which each have distinct models:

1: Website -> Public

This is basically "footnotes for the web". People have dabbled in this area for decades but never gotten far enough to be a serious effort. It's pretty high value, imo, but very difficult to do right. There are significant a11y considerations and form-factor issues: the ideal display for a footnote is completely different for a desktop, a phone, and a printed page. Browsers doing the heavy lifting here, with a clear semantic role and linkage, and default display customized to the form-factor, could bring a lot of value to what is currently a drastically underserved space.

Privacy and abuse potential are nil here; this is identical to any other content on the page.

The hardest part here will be figuring out how, and how much, pages are able to style their footnotes from the default. Likely this'll involve some work with OpenUI for specialized opt-ins that let the page take over displaying when they want to.

1A: Small Group Blessed By The Website -> Public

This is a possible blending of (1) and (4), where a website can bless a public annotation source as applying to their page by default. Example would be articles with commentary from staff.

Theoretically this is handleable by just giving everyone access to the CMS (at which point it collapses to (1)) and that might be sufficient (or relying on CMSes to integrate the ability to consume annotation sources and bake them in). It might also be handleable by just letting the site ask users to subscribe to a public annotation source they point to. But if we have annotation controls exposed by the browser to let users show/hide annotation streams individually on a page, I think it makes sense to let a page request an annotation source in their metadata, and possibly toggle it on or off by default.
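As a rough sketch of that last option, the show/hide decision could be a small pure function over the sources a page requests and the user's per-stream choices. Everything here is hypothetical: the `PageRequestedSource` shape, the `defaultOn` flag, and the rule that an explicit user choice always beats the page's default are assumptions for illustration, not part of any proposal.

```typescript
// Hypothetical shapes: nothing here is defined by any spec.
interface PageRequestedSource {
  url: string;        // annotation source the page points to in its metadata
  defaultOn: boolean; // whether the page asks for it to be shown by default
}

type UserChoice = "show" | "hide";

// Decide which page-requested streams to display.
// Assumption: an explicit user choice always overrides the page's default.
function visibleStreams(
  pageSources: PageRequestedSource[],
  userChoices: Map<string, UserChoice>,
): string[] {
  const visible: string[] = [];
  for (const source of pageSources) {
    const choice = userChoices.get(source.url);
    const show = choice === undefined ? source.defaultOn : choice === "show";
    if (show) visible.push(source.url);
  }
  return visible;
}

// Example: the page requests two streams, and the user has opted in to the
// one the page left off by default.
const requested: PageRequestedSource[] = [
  { url: "https://news.example/staff-notes", defaultOn: true },
  { url: "https://news.example/reader-panel", defaultOn: false },
];
const choices = new Map<string, UserChoice>([
  ["https://news.example/reader-panel", "show"],
]);
const shown = visibleStreams(requested, choices);
```

Keeping the user's choice as the final word means a page can suggest a stream but never force one on.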

2: Personal -> Personal

Private notes for personal use, attached to some notion of browser identity and available cross-device via browser sync. I've wanted this for a long time: leaving notes on people's Twitter profiles about why I followed or blocked them; leaving notes on tech articles about what worked or didn't; etc.

Theoretically this can be (and is) done via extensions today, but it's problematic in several regards:

  1. Notes are either stored on-device, losing cross-device sync, or they're stored on servers controlled by the extension, losing privacy and gaining a dependency on the company surviving (and such companies have a pretty bad track record; every reference on the Moz wiki page is now dead).
  2. Not available on all form factors; extensions don't work on mobile, where people do most of their computing.
  3. Requires giving the extension read/write access to every page, which is a huge abuse vector.

Abuse potential is nil, since it's self-generated and self-viewed, equivalent to just taking notes on a notepad next to the screen. Privacy risk is nil beyond the existing browser-sync privacy issues.

3: Small Group -> Itself

Examples include spouses sharing notes with each other, study groups annotating info pages, moderation groups in a community leaving notes to each other on particular pages/users, etc.

If this is opt-in, direct abuse potential is nil, similar to any group chat. You'll join groups with people you like, and can leave whenever you want if they get abusive.

However, the design must ensure the list is private with approval; anyone who knows your email address (or whatever identifier) shouldn't be able to sign up without notice. This, then, means you have to be careful not to create a new communication channel in the form of invites and/or join requests, because every comm channel will be used to spam and abuse. Possible idea is silent double-approval, where both an invite from A->B and a request from B->A must be received, with no notice of either, and only when they both exist is the invitation auto-approved.
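A minimal sketch of the silent double-approval idea, assuming an in-memory store and string user IDs. The class and method names are made up for illustration; the point is only that neither call returns or emits anything the other party could observe.

```typescript
// Hypothetical sketch of "silent double-approval"; all names are illustrative.
type UserId = string;

class SilentDoubleApproval {
  private invites = new Set<string>();  // owner invited a user
  private requests = new Set<string>(); // user asked to join an owner's group
  private members = new Map<UserId, Set<UserId>>(); // owner -> approved members

  private static key(from: UserId, to: UserId): string {
    return JSON.stringify([from, to]);
  }

  // Both entry points return void on purpose: the caller learns nothing about
  // whether the matching half exists, and the other party gets no notice.
  invite(owner: UserId, invitee: UserId): void {
    this.invites.add(SilentDoubleApproval.key(owner, invitee));
    this.tryApprove(owner, invitee);
  }

  requestToJoin(requester: UserId, owner: UserId): void {
    this.requests.add(SilentDoubleApproval.key(requester, owner));
    this.tryApprove(owner, requester);
  }

  // Membership is granted only once both halves are on record.
  private tryApprove(owner: UserId, candidate: UserId): void {
    const inviteKey = SilentDoubleApproval.key(owner, candidate);
    const requestKey = SilentDoubleApproval.key(candidate, owner);
    if (!this.invites.has(inviteKey) || !this.requests.has(requestKey)) return;
    this.invites.delete(inviteKey);
    this.requests.delete(requestKey);
    let group = this.members.get(owner);
    if (group === undefined) {
      group = new Set<UserId>();
      this.members.set(owner, group);
    }
    group.add(candidate);
  }

  isMember(owner: UserId, user: UserId): boolean {
    const group = this.members.get(owner);
    return group !== undefined && group.has(user);
  }
}
```

With this shape, an unsolicited invite or join request just sits unmatched forever, so it can't be used as a spam channel. A real design would still need to decide who may call `isMember`, since that check could itself leak whether the other half exists.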

There are some additional UI challenges here, because people creating notes will want to be able to swap whether a note is personal, or in a given group. Also, when you see a note you'll need an indication of which source you're seeing it from.

Also, maintaining the groups is probably something browsers or other identity providers would have to take on for themselves, with at least a little community management (designating some people as capable of giving invites, others as capable of adding notes, and others as just readers). Doable, just needs some careful work.

I suspect this is the hardest group to target, overall, but it's valuable and deserving of thought.

4: Small Group -> Public

Examples include extensions like Shinigami Eyes (annotates transphobic Twitter accounts), fact-checkers, etc.

Mode here is fully public - anyone can subscribe without approval if they know the URL or whatever.

This is, essentially, just RSS. We don't have to do any management ourselves, we just have to know how to consume annotation feeds and display them. We might actually want to use RSS as a bare-bones transfer format, but probably also want a more efficient format with better delta updates, as these can easily be enormous feeds.
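To make the "better delta updates" point concrete, here is one possible shape for a delta-based feed client. The `Delta` format, the integer cursors, and the upsert/delete operations are all assumptions invented for this sketch; no existing feed format is implied.

```typescript
// Hypothetical delta format for an annotation feed; not taken from any spec.
interface FeedAnnotation {
  id: string;
  target: string; // URL of the annotated page
  body: string;
}

type DeltaOp =
  | { op: "upsert"; annotation: FeedAnnotation }
  | { op: "delete"; id: string };

interface Delta {
  since: number;  // cursor this delta was computed against
  cursor: number; // cursor after applying it
  ops: DeltaOp[];
}

class AnnotationStore {
  private byId = new Map<string, FeedAnnotation>();
  cursor = 0;

  // Apply a delta only if it continues from our current cursor.
  // Returns false on a gap, in which case the caller should refetch the feed.
  apply(delta: Delta): boolean {
    if (delta.since !== this.cursor) return false;
    for (const op of delta.ops) {
      if (op.op === "upsert") {
        this.byId.set(op.annotation.id, op.annotation);
      } else {
        this.byId.delete(op.id);
      }
    }
    this.cursor = delta.cursor;
    return true;
  }

  // Annotations to display for a given page.
  forTarget(url: string): FeedAnnotation[] {
    return Array.from(this.byId.values()).filter((a) => a.target === url);
  }
}
```

The client only ever downloads what changed since its last cursor, which matters when a feed covers a very large number of pages or accounts.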

Abuse and privacy are nil; they're exactly equivalent to an RSS feed you can choose to subscribe to or leave. Anti-abuse mechanisms on the public web are usually somewhere between "shitty" and "non-existent", but this isn't a new vector in any significant way.

5: Public -> Public

Notes left from the arbitrary public to the arbitrary public.

In short, don't. This is a horrific abuse vector that we can sink infinite money into and never get a proper handle on. It is impossible to moderate. We must not try. See #1 for more details on why this is impossible and deeply damaging when we inevitably fail at it.


These are all the basic usage scenarios I've been able to come up with so far. They're likely not complete.

Thanks for the thoughtful write-up, Tab. From this list, the use cases that I'm most interested in serving are 2-4. Some comments/questions below.

FWIW - I don't see the browser as (necessarily) providing any kind of service/hosting here; purely implementing the protocols and UI surfaces necessary to enable third party hosts to provide annotations to users. In this world, users could subscribe their user agent to services they choose, akin to installing an extension or following an RSS feed, in which case they would get annotations for all available target pages (or some form of blind-to-the-service fetch-per-URL). On individual documents, authors could (via e.g. <link> tags) point the user agent at a URL to fetch annotations from. Links could embed services/content in them for the user agent to display on load, which enables outside authors or users to annotate the destination content.

1: Website -> Public

Could you elaborate on why we'd need any browser support here at all? If the author is the one providing footnotes on their own page, why can't they just do so with markup directly on the page? Or is it similar to other standard (e.g. form) controls where the author can roll their own today, but gets consistency, accessibility, platform-specific UI, etc. for free by using a browser built-in control?

If that's the case, work we do here could provide some building blocks for solving this -- and we should keep it in mind when making any design choices that could affect such a solution in the future -- but it isn't the problem we're trying to solve (though I do see its value).

2: Personal -> Personal

Agree this is a great use case but I don't think there's much to do here in terms of web platform APIs. Basically, browsers can and do implement this today (I believe the old Edge had some form of this). Having non-personal annotations would basically give this to a browser for free, since supporting them collapses down to supporting personal ones as well (in terms of browser UI and support).

3: Small Group -> Itself

> However, the design must ensure the list is private with approval; anyone who knows your email address (or whatever identifier) shouldn't be able to sign up without notice

I'm not sure I understand this part -- why would this be different from any other "closed group" service? Or maybe we're just coming at this from different ideas of how this would work...

The way I see this working is that annotation services could provide the ACL layer for this use case. My canonical example is WhatsApp -- it could provide a private-to-me URL that points to an annotation service. That endpoint would serve up (and also provide a write endpoint) only annotations made by people in my WhatsApp group. Only members of a WhatsApp group are able to see each other's annotations; the ACL/auth is performed by WhatsApp on their end - potentially just using private URLs.
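One way to picture the "private URLs as the ACL" idea is a service that keys all access on an unguessable per-member token embedded in the URL. This is a hedged sketch only: `PrivateUrlGroupFeed`, its methods, and the token handling are hypothetical, a single instance stands for a single group, and a real service would generate long random tokens rather than accept them from the caller.

```typescript
// Hypothetical sketch: one instance represents one closed group's feed.
interface GroupNote {
  author: string;
  target: string; // URL of the annotated page
  body: string;
}

class PrivateUrlGroupFeed {
  // token (the secret part of a member's private URL) -> member
  private tokenToMember = new Map<string, string>();
  private notes: GroupNote[] = [];

  // In a real service the token would be a long random value the service mints.
  grantAccess(member: string, token: string): void {
    this.tokenToMember.set(token, member);
  }

  // Leaving the group invalidates that member's private URL(s).
  revokeAccess(member: string): void {
    const stale: string[] = [];
    this.tokenToMember.forEach((m, token) => {
      if (m === member) stale.push(token);
    });
    for (const token of stale) this.tokenToMember.delete(token);
  }

  // Read endpoint: null means "not authorized", with no further detail.
  read(token: string, target: string): GroupNote[] | null {
    if (!this.tokenToMember.has(token)) return null;
    return this.notes.filter((n) => n.target === target);
  }

  // Write endpoint: the author is derived from the token, never from input.
  write(token: string, target: string, body: string): boolean {
    const author = this.tokenToMember.get(token);
    if (author === undefined) return false;
    this.notes.push({ author, target, body });
    return true;
  }
}
```

As a simplification, notes written by a member who later leaves stay in the feed; whether they should is a policy question for the service, not the user agent.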

4: Small Group -> Public

> This is, essentially, just RSS. We don't have to do any management ourselves, we just have to know how to consume annotation feeds and display them.

This is basically how I see all the use cases working. The WebAnnotation protocol already defines the mechanisms for discovering and fetching annotation data so I think we could just build on that.
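For readers who haven't seen it, the sketch below shows roughly what a single annotation looks like, assuming the W3C Web Annotation Data Model is what would travel over that protocol. The field names (`@context`, `type`, `body`, `target`, `TextQuoteSelector`) follow that data model; the `makeNote` helper and the example URLs are made up for illustration.

```typescript
// Types covering a small subset of the W3C Web Annotation Data Model.
interface TextQuoteSelector {
  type: "TextQuoteSelector";
  exact: string;   // the quoted text being annotated
  prefix?: string; // optional text immediately before, to disambiguate
  suffix?: string; // optional text immediately after
}

interface WebAnnotation {
  "@context": "http://www.w3.org/ns/anno.jsonld";
  id: string;
  type: "Annotation";
  body: { type: "TextualBody"; value: string; format?: string };
  target: { source: string; selector?: TextQuoteSelector };
}

// Hypothetical helper: build a plain-text note anchored to a quoted passage.
function makeNote(
  id: string,
  pageUrl: string,
  quotedText: string,
  note: string,
): WebAnnotation {
  return {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    id,
    type: "Annotation",
    body: { type: "TextualBody", value: note, format: "text/plain" },
    target: {
      source: pageUrl,
      selector: { type: "TextQuoteSelector", exact: quotedText },
    },
  };
}

const example = makeNote(
  "https://annotations.example/anno/1",
  "https://blog.example/post",
  "footnotes for the web",
  "Worth comparing with print footnotes.",
);
const serialized = JSON.stringify(example, null, 2);
```

Anchoring by quoted text rather than by DOM position is one of the selector options in the data model, and tends to survive page markup changes better.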

5: Public -> Public

Agree this isn't that interesting and comes with many challenges. But I wouldn't explicitly block it either. If someone wants to create a service that allows anonymous public read/write meant for the whole world...well, they can -- users have to opt in to it. I as a user simply wouldn't subscribe to it.

> Could you elaborate on why we'd need any browser support here at all? If the author is the one providing footnotes on their own page, why can't they just do so with markup directly on the page? Or is it similar to other standard (e.g. form) controls where the author can roll their own today, but gets consistency, accessibility, platform-specific UI, etc. for free by using a browser built-in control?

Yup, footnotes are incredibly fiddly to get right, especially wrt accessibility and desktop vs mobile vs print. I'm sure somebody, somewhere has gotten them right, but I sure haven't run into them.

But really this is just a separate feature entirely that happens to live in the same conceptual space; we can push it off.

> The way I see this working is that annotation services could provide the ACL layer for this use case. My canonical example is WhatsApp -- it could provide a private-to-me URL that points to an annotation service. That endpoint would serve up (and also provide a write endpoint) only annotations made by people in my WhatsApp group. Only members of a WhatsApp group are able to see each other's annotations; the ACL/auth is performed by WhatsApp on their end - potentially just using private URLs.

The basic value-add for doing it ourselves is that we can protect users' privacy and data longevity more effectively than a third-party service. Essentially, the browser would just be an annotation-feed provider as well. But keeping that as a separate task that we can build on top of the more general case seems fine.

Use-case (2), in particular, seems like an easy value-add for browsers to do for themselves; just including a private annotation service in the browser sync data. If imagined on top of the "annotation service with feed" model this doesn't need any special handling; it'll just be an additional service a browser could provide for its users. Importantly, it seems like a good, simple way to bootstrap people's interest in the feature, rather than waiting for momentum to build around third-party annotation-feed providers.

So yeah, no need to address this in the spec itself; it's built on top.

> This is basically how I see all the use cases working. The WebAnnotation protocol already defines the mechanisms for discovering and fetching annotation data so I think we could just build on that.

I haven't read that spec yet, but 👍

> Agree this isn't that interesting and comes with many challenges. But I wouldn't explicitly block it either. If someone wants to create a service that allows anonymous public read/write meant for the whole world...well, they can -- users have to opt in to it. I as a user simply wouldn't subscribe to it.

Right, but then it's just (4) but awful, and that's fine. I just want to make sure we stay far away from the use-case of a built-in "anyone can publicly annotate and we'll show you all the public annotations" model. If a third-party service wants to get in regulatory trouble for hosting illegal content, more power to them.

A lot of discussion in this space seems to naively gravitate toward this idea and ignore or minimize the spam/abuse potential; see the Moz wiki page I linked for some comments to this effect about this model promoting "democracy". There are a lot of "interesting" problems in this space that we could spend time on, and I want to make sure we don't waste effort when the end result will just be awful anyway.