w3c/at-driver

AT Automation API Roadmap

zcorpan opened this issue · 5 comments

This is a proposed roadmap of milestones for the AT Automation API specification (see https://github.com/w3c/aria-at-automation#proposal-specify-a-new-service-to-compliment-webdriver )

The relative order of the milestones below is somewhat arbitrary, and some could be rearranged or happen in parallel. Any dependencies on other milestones are documented. Security considerations for each milestone are also documented.

  • Milestone 0: Protocol #19
  • Milestone 1: Settings #22
  • Milestone 2: Capture output #25
  • Milestone 3: Keypresses #26
  • Milestone 4: Activate commands
  • Milestone 5: Internal state
  • Milestone 6: Headless mode

MVP is milestones 0 through 3.

Milestone 0: Protocol

Design an architecture, an API shape, and a protocol.

security

  • opt in to API
  • use an existing widely supported network protocol (e.g. WebSocket, like WebDriver BiDi)
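As a rough illustration of what a WebDriver BiDi-like message shape could look like, here is a minimal sketch. It assumes the protocol reuses the `id` / `method` / `params` command envelope that WebDriver BiDi uses; nothing here is decided for this API, and `build_command` and the `example.ping` method name are hypothetical.

```python
import itertools
import json

# Monotonic counter so each command gets a unique id that a response
# can later be matched against (this mirrors WebDriver BiDi's envelope;
# its use for AT automation is an assumption).
_next_id = itertools.count(1)


def build_command(method, params=None):
    """Serialize one command as a JSON text frame for a WebSocket."""
    message = {
        "id": next(_next_id),
        "method": method,
        "params": params or {},
    }
    return json.dumps(message)


# Hypothetical method name, for illustration only.
frame = build_command("example.ping")
```

The resulting string could be sent as a text frame over any WebSocket client; the transport itself is left out to keep the sketch self-contained.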

Milestone 1: Settings

Vendor-specific settings (also see #16)

security

  • opt in to API

Milestone 2: Capture output

API to capture spoken output without changing the TTS voice (also see #24)

security

  • opt in to API

  • sandbox (e.g. do not capture output when the expected applications do not have focus)
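The sandbox idea above could be sketched roughly as follows. This assumes the screen reader can tell which application has focus at the moment it speaks; `CaptureSandbox`, `on_speech`, and the application names are hypothetical and only illustrate the gating logic.

```python
from dataclasses import dataclass, field


@dataclass
class CaptureSandbox:
    """Forward spoken output only while an expected application has focus."""

    allowed_apps: frozenset
    captured: list = field(default_factory=list)

    def on_speech(self, text, focused_app):
        # Drop output from any application the automation client did not
        # ask to observe, so unrelated speech never leaves the screen reader.
        if focused_app not in self.allowed_apps:
            return False
        self.captured.append(text)
        return True


sandbox = CaptureSandbox(allowed_apps=frozenset({"firefox"}))
sandbox.on_speech("heading level 1", "firefox")    # captured
sandbox.on_speech("password field", "keychain")    # dropped
```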

Milestone 3: Keypresses

API to simulate keypresses (also see #12)

security

  • opt in to API

  • not HID level simulated keypresses

  • sandbox (e.g. do not allow sending keypresses when the expected applications do not have focus)

  • session
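One possible reading of the "session" item is that keypress commands are only honoured for a session the client explicitly started. The sketch below assumes that reading; `SessionRegistry`, `press_keys`, and the returned shape are hypothetical, and the actual delivery of keys to the screen reader is left out.

```python
import secrets


class SessionRegistry:
    """Track active automation sessions and reject commands outside them."""

    def __init__(self):
        self._active = set()

    def start(self):
        # Unguessable id so other local processes cannot reuse a session.
        session_id = secrets.token_hex(16)
        self._active.add(session_id)
        return session_id

    def end(self, session_id):
        self._active.discard(session_id)

    def press_keys(self, session_id, keys):
        if session_id not in self._active:
            raise PermissionError("unknown or ended session")
        # A real implementation would pass the keys to the screen reader's
        # own input handling here rather than simulating them at HID level.
        return {"keys": list(keys)}


registry = SessionRegistry()
session = registry.start()
registry.press_keys(session, ["h"])
registry.end(session)
```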

Milestone 4: Activate commands

Vendor-specific API to activate commands (also see #12). Example: go to the next heading. At minimum, this covers setting "modes" (as used in aria-at).

security

  • opt in to API

  • sandbox

  • session

  • exclude access to any security-sensitive commands

Straw-person message structure example:

{
  "method": "nvda:activateCommand",
  "params": {
    "command": "change to browse mode"
  }
}

Return Type: EmptyResult
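To make the straw-person message concrete, here is a minimal sketch of how a receiver might handle it. It assumes the `vendor:method` naming shown above and models `EmptyResult` as an empty dict; `handle_activate_command` and the contents of `EXCLUDED_COMMANDS` are hypothetical.

```python
import json

# Hypothetical example of a command a vendor might treat as
# security-sensitive and refuse to run through automation.
EXCLUDED_COMMANDS = {"open settings dialog"}


def handle_activate_command(raw_message, vendor="nvda"):
    """Validate a vendor-prefixed activateCommand message."""
    message = json.loads(raw_message)
    prefix, _, name = message["method"].partition(":")
    if prefix != vendor or name != "activateCommand":
        raise ValueError("unsupported method: " + message["method"])
    command = message["params"]["command"]
    if command in EXCLUDED_COMMANDS:
        raise PermissionError("command is excluded: " + command)
    # The screen reader would run the command here; omitted in this sketch.
    return {}  # EmptyResult, modelled as an empty object (an assumption)


raw = json.dumps({
    "method": "nvda:activateCommand",
    "params": {"command": "change to browse mode"},
})
result = handle_activate_command(raw)
```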

Milestone 5: Internal state

Depends on: milestone 4

New API to expose internal state or information in screen readers that is not directly exposed to users but is still useful for testing purposes, e.g. virtual focus position, mode (interaction mode vs. reading mode). At minimum, this covers getting the current "mode" (as used in aria-at).

security

  • opt in to API
  • exclude access to any security-sensitive information

Straw-person message structure example:

{
  "method": "nvda:getState",
  "params": {
    "state": "mode"
  }
}

Return Type: TBD
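The "exclude access to any security-sensitive information" item could be approached with an allowlist of readable state keys, sketched below. Since the return type is still TBD, the returned dict is purely illustrative; `EXPOSED_STATE`, `get_state`, and the sample state values are hypothetical.

```python
# Only state keys listed here may be read through the API; everything
# else is treated as potentially sensitive by default.
EXPOSED_STATE = frozenset({"mode"})


def get_state(internal_state, requested):
    """Return one allowlisted piece of internal state."""
    if requested not in EXPOSED_STATE:
        raise PermissionError("state is not exposed: " + requested)
    # Illustrative return shape only; the real return type is undecided.
    return {"state": requested, "value": internal_state[requested]}


# Hypothetical internal state of a screen reader.
internal = {"mode": "browse", "clipboard": "secret"}
current_mode = get_state(internal, "mode")
```

An allowlist fails closed: state added to the screen reader later stays hidden until someone decides it is safe to expose.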

Milestone 6: Headless mode

Depends on: milestone 2

Turn off output to TTS (headless mode) (also see #13)

security

  • opt in to API

  • signal to user somehow that SR is active (visual + audio)?

@zcorpan Thanks for writing this up! Some comments:

enunciate punctuation

This is quite a complex setting, so we'll need to scope out exactly what we want/need here. E.g. different screen readers have different predefined levels, but also some additional customisation on top of that (such as symbols dictionaries in NVDA).

Start reading

I don't know what this command is/would be expected to do. Do you mean starting a say all, to read from the cursor position to the end of the page? Note that we don't currently use that in any ARIA-AT tests.

Move to first status menu in menu bar

Not sure what this refers to. Which menu bar?

Find next/previous misspelled word

We don't currently have any ARIA-AT tests relying on this, and I'm not sure which screen readers even support it in virtual web content. Definitely doesn't seem like a Milestone 4 command to me.

enunciate punctuation

This is quite a complex setting, so we'll need to scope out exactly what we want/need here. E.g. different screen readers have different predefined levels, but also some additional customisation on top of that (such as symbols dictionaries in NVDA).

OK.

Start reading

I don't know what this command is/would be expected to do. Do you mean starting a say all, to read from the cursor position to the end of the page? Note that we don't currently use that in any ARIA-AT tests.

I believe that's what the command does, yes. I don't know if we need it for aria-at, though it might be useful for more general testing of websites or web apps.

Move to first status menu in menu bar

Not sure what this refers to. Which menu bar?

I'm not sure. It doesn't seem relevant for testing web content, so I'll remove it from the list.

Find next/previous misspelled word

We don't currently have any ARIA-AT tests relying on this, and I'm not sure which screen readers even support it in virtual web content. Definitely doesn't seem like a Milestone 4 command to me.

Indeed, I'll remove it.

Thanks!

For Milestone 4, I think we are missing Navigate to the previous element.

I've edited the milestones in OP to reflect our current thinking. In particular:

  • Milestone 1 (settings) is now vendor-specific and can include all settings (except any that are excluded for security reasons)
  • Milestone 4 (activate commands) is similarly vendor-specific
  • Removed milestones 6 and 7 (previously "more settings" and "more commands")
  • Milestones 0 through 3 should represent a good MVP

Based on our conversation in the CG meeting yesterday (minutes), I think we should make the following adjustments to the roadmap:

  • Milestone 4: Activate commands
  • Milestone 5: Headless mode
  • Milestone 6: Internal state

becomes

  • Milestone 4: Activate commands - vendor-specific commands, at minimum setting "modes" (as used in aria-at)
  • Milestone 5: Internal state - expose vendor-specific state, at minimum getting the current "mode" (as used in aria-at)
  • Milestone 6: Headless mode