Writing Assistance APIs
Hello, TAG!
I'm requesting an early TAG design review of the writing assistance APIs.
Browsers and operating systems are increasingly expected to gain access to a language model. Web applications can benefit from using language models for a variety of use cases.
We're proposing a group of APIs that use language models to give web developers high-level assistance with writing. Specifically:
- The summarizer API produces summaries of input text;
- The writer API writes new material, given a writing task prompt;
- The rewriter API transforms and rephrases input text in the requested ways.
Because these APIs share underlying infrastructure and API shape, and have many cross-cutting concerns, we include them all in one explainer, to avoid repeating ourselves across three repositories. However, they are separate API proposals, and can be evaluated independently.
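To make the shared API shape concrete, here is a hedged JavaScript sketch of feature-detecting and using the summarizer API. The `Summarizer` global, the `availability()` and `create()` methods, and the option values are taken from the explainer's current draft and may well change as the proposal evolves:

```javascript
// Hedged sketch of the summarizer API shape described in the explainer.
// The `Summarizer` global, `availability()`, `create()`, and the option
// names below follow the explainer draft and are subject to change.
async function summarizeIfAvailable(text) {
  // Feature-detect: the API is experimental and absent in most environments.
  if (typeof Summarizer === "undefined") return null;

  // availability() is expected to report whether the model is usable,
  // e.g. "unavailable", "downloadable", "downloading", or "available".
  if ((await Summarizer.availability()) === "unavailable") return null;

  // The type/format/length options are illustrative values from the
  // explainer; create() may also trigger a model download.
  const summarizer = await Summarizer.create({
    type: "key-points",
    format: "plain-text",
    length: "short",
  });
  return summarizer.summarize(text);
}
```

The writer and rewriter APIs are proposed to follow the same create-then-invoke shape (e.g. a `create()` factory followed by `write()` or `rewrite()`), which is part of why the three proposals share one explainer.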
- Explainer: https://github.com/WICG/writing-assistance-apis/blob/main/README.md
- User research: in a series of prototyping sessions with partners, many separate applications were built using these APIs or their predecessors. These sessions were conducted confidentially (so as not to leak specific product plans), but from them we extracted the use cases listed in the explainer.
- Security and Privacy self-review: https://github.com/WICG/writing-assistance-apis/blob/main/security-privacy-questionnaire.md
- GitHub repo: https://github.com/WICG/writing-assistance-apis
- Primary contacts:
- Domenic Denicola (@domenic), Google, editor
- Organization/project driving the design: Google
- Multi-stakeholder feedback:
- Chromium comments: We are excited to start trialing these APIs with developers through origin trials and behind-a-flag experiments.
- Mozilla comments: mozilla/standards-positions#1067
- WebKit comments: WebKit/standards-positions#393
- Web developers:
- As mentioned above, in a series of prototyping sessions we have heard significant excitement about using these APIs.
- Public feedback on WICG/proposals#163 was mixed. To summarize, some themes we saw include: requests for more capabilities (e.g. full prompting of a language model instead of higher-level APIs (our response), and multimodal support); a desire to make sure the API actually works robustly in many real-world use cases; requests to remove any safety/ethical safeguards; and confusion about on-device vs. cloud APIs.
Further details:
- I have reviewed the TAG's Web Platform Design Principles
- The group where the incubation/design work on this is being done (or is intended to be done in the future): WICG
- The group where standardization of this work is intended to be done ("unknown" if not known): not completely known, but we are discussing the APIs with the Web Machine Learning Working Group at TPAC, and it is possible a future version of their charter would welcome us.
- Existing major pieces of multi-implementer review or discussion of this design: see above.
- Major unresolved issues with or opposition to this design:
- We are aware of previous TAG feedback (in #948) regarding API surface details, and have captured that in the explainer.
- As with the translator/language detector APIs (#948), there is a tension between interoperability and exposing whether the model is on-device or cloud-based; we discuss this a bit more in the explainer.
- As with the translator/language detector APIs (#948), there are several privacy concerns, discussed in the explainer. We believe there are reasonable mitigations possible there, but will need to do some experimentation to find the best ones.
- This work is being funded by: Google
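One place the on-device vs. cloud distinction mentioned above surfaces in the API is the explainer's proposed `monitor` option on `create()`, which reports model-download progress. The sketch below uses names from the explainer draft, which are subject to change, and feature-detects so it degrades gracefully where the API is absent:

```javascript
// Hedged sketch: the explainer proposes a `monitor` option on create()
// whose argument receives "downloadprogress" events while a model is
// fetched. Event and option names follow the draft and may change.
async function createSummarizerWithProgress(onProgress) {
  // Feature-detect: the API is experimental and absent in most environments.
  if (typeof Summarizer === "undefined") return null;

  return Summarizer.create({
    monitor(m) {
      // The event is expected to carry progress information (e.g. a
      // `loaded` fraction); here we just forward it to the caller.
      m.addEventListener("downloadprogress", (e) => onProgress(e.loaded));
    },
  });
}
```

Observable signals like download progress are one example of why fully hiding whether a model is on-device or cloud-hosted is difficult, as the explainer discusses.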
You should also know that...
A public note (without TAG consensus) so that @domenic can start thinking in this direction too: We should think about how https://www.w3.org/reports/ai-web-impact/ and https://www.w3.org/TR/webmachinelearning-ethics/ should affect our opinions here. For example, https://www.w3.org/reports/ai-web-impact/#transparency-on-ai-mediated-services considers the use of Model Cards to help people evaluate the suitability of particular models for particular purposes. How should that information be exposed to the web developers considering use of this API, and to the end-users who have to evaluate the website's output?