AT Automation API Roadmap
zcorpan opened this issue · 5 comments
This is a proposed roadmap of milestones for the AT Automation API specification (see https://github.com/w3c/aria-at-automation#proposal-specify-a-new-service-to-compliment-webdriver )
The relative order of the milestones below is somewhat arbitrary, and some could be rearranged or happen in parallel. Any dependencies on other milestones are documented. Security considerations for each milestone are also documented.
- Milestone 0: Protocol #19
- Milestone 1: Settings #22
- Milestone 2: Capture output #25
- Milestone 3: Keypresses #26
- Milestone 4: Activate commands
- Milestone 5: Internal state
- Milestone 6: Headless mode
MVP is milestones 0 through 3.
Milestone 0: Protocol
Design an architecture, API shape, and protocol.
security
- opt in to API
- use an existing widely supported network protocol (e.g. WebSocket, like WebDriver BiDi)
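To make the "like WebDriver BiDi" idea concrete, here is a rough sketch of a message envelope, assuming we borrow the `id`/`method`/`params` command shape that WebDriver BiDi sends over WebSocket. The helper names `make_command` and `match_response` and the `example:ping` method are hypothetical and only for illustration; nothing here is decided.

```python
import itertools
import json

# Monotonic counter so every command gets a unique id (assumption: ids are integers).
_ids = itertools.count(1)


def make_command(method, params=None):
    """Serialize a command with a unique id so a response can be correlated."""
    return json.dumps({"id": next(_ids), "method": method, "params": params or {}})


def match_response(command_json, response_json):
    """Return True if the response carries the same id as the command."""
    return json.loads(command_json)["id"] == json.loads(response_json)["id"]


# Usage: the serialized string is what would be sent as a WebSocket text frame.
ping = make_command("example:ping")
```

Correlating by `id` lets a client have several commands in flight on one connection without relying on response order.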
Milestone 1: Settings
Vendor-specific settings (also see #16)
security
- opt in to API
Milestone 2: Capture output
API to capture spoken output without changing the TTS voice (also see #24)
security
- opt in to API
- sandbox (e.g. do not capture output when the expected applications do not have focus)
Milestone 3: Keypresses
API to simulate keypresses (also see #12)
security
- opt in to API
- not HID level simulated keypresses
- sandbox (e.g. do not allow sending keypresses when the expected applications do not have focus)
- session
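The sandbox and session items above could combine into a single check in front of the screen reader's key handling. This is only a sketch under assumptions: `KeypressGate`, the session ids, and the application names are all hypothetical, and how focus is actually determined is vendor-specific.

```python
class KeypressGate:
    """Hypothetical gate deciding whether a simulated keypress is accepted."""

    def __init__(self, expected_apps):
        # Applications under test; keypresses are refused when anything else has focus.
        self.expected_apps = set(expected_apps)
        # Sessions that have been explicitly started and not yet ended.
        self.sessions = set()

    def start_session(self, session_id):
        self.sessions.add(session_id)

    def end_session(self, session_id):
        self.sessions.discard(session_id)

    def allow(self, session_id, focused_app):
        """Accept a keypress only from a live session while an expected app has focus."""
        return session_id in self.sessions and focused_app in self.expected_apps
```

Tying keypresses to a session means that once the session ends, a stale client can no longer inject input.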
Milestone 4: Activate commands
Vendor-specific API to activate commands (also see #12). Example: go to the next heading. At minimum, this covers setting "modes" (as used in aria-at).
security
- opt in to API
- sandbox
- session
- exclude access to any security-sensitive commands
Straw-person message structure example:
{
  "method": "nvda:activateCommand",
  "params": {
    "command": "change to browse mode"
  }
}
Return Type: EmptyResult
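The straw-person message above, together with the "exclude security-sensitive commands" item, could be handled roughly like this. A minimal sketch: `SENSITIVE_COMMANDS`, `build_activate_command`, and the denylist entry are hypothetical. Only the `method`/`params`/`command` shape is taken from the example.

```python
import json

# Hypothetical denylist; which commands count as security-sensitive is undecided.
SENSITIVE_COMMANDS = {"example sensitive command"}


def build_activate_command(vendor, command):
    """Build a vendor-prefixed activateCommand message, refusing denylisted commands."""
    if command in SENSITIVE_COMMANDS:
        raise PermissionError(f"command not exposed to automation: {command}")
    return json.dumps({
        "method": f"{vendor}:activateCommand",
        "params": {"command": command},
    })


# Usage: reproduces the straw-person message for NVDA.
message = build_activate_command("nvda", "change to browse mode")
```

In practice the exclusion would have to be enforced on the screen reader side as well, since a client-side check alone can be bypassed.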
Milestone 5: Internal state
Depends on: milestone 4
New API to expose internal state or information in screen readers that is not directly exposed to users but is still useful for testing purposes, e.g. virtual focus position, mode (interaction mode vs. reading mode). At minimum, this covers getting the current "mode" (as used in aria-at).
security
- opt in to API
- exclude access to any security-sensitive information
Straw-person message structure example:
{
  "method": "nvda:getState",
  "params": {
    "state": "mode"
  }
}
Return Type: TBD
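One way the screen reader side could answer the message above while excluding security-sensitive information is an allowlist of readable state names. This is a sketch only: `EXPOSED_STATE`, `handle_get_state`, and the `result`/`error` response shapes are assumptions, since the return type is still TBD.

```python
# Hypothetical allowlist: only state names listed here are readable via the API.
EXPOSED_STATE = {"mode"}


def handle_get_state(message, internal_state):
    """Answer a getState message, exposing only allowlisted state names."""
    name = message["params"]["state"]
    if name not in EXPOSED_STATE:
        # Assumed error shape; the real protocol would define its own.
        return {"error": "unknown state", "state": name}
    return {"result": {name: internal_state[name]}}
```

An allowlist fails closed: state added to the screen reader later is not exposed until someone decides it is safe.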
Milestone 6: Headless mode
Depends on: milestone 2
Turn off output to TTS (headless mode) (also see #13)
security
- opt in to API
- signal to user somehow that SR is active (visual + audio)?
@zcorpan Thanks for writing this up! Some comments:
> enunciate punctuation

This is quite a complex setting, so we'll need to scope out exactly what we want/need here. E.g. different screen readers have different predefined levels, but also some additional customisation on top of that (such as symbols dictionaries in NVDA).

> Start reading

I don't know what this command is/would be expected to do. Do you mean starting a say all, to read from the cursor position to the end of the page? Note that we don't currently use that in any ARIA-AT tests.

> Move to first status menu in menu bar

Not sure what this refers to. Which menu bar?

> Find next/previous misspelled word

We don't currently have any ARIA-AT tests relying on this, and I'm not sure which screen readers even support it in virtual web content. Definitely doesn't seem like a Milestone 4 command to me.
> > enunciate punctuation
>
> This is quite a complex setting, so we'll need to scope out exactly what we want/need here. E.g. different screen readers have different predefined levels, but also some additional customisation on top of that (such as symbols dictionaries in NVDA).

OK.

> > Start reading
>
> I don't know what this command is/would be expected to do. Do you mean starting a say all, to read from the cursor position to the end of the page? Note that we don't currently use that in any ARIA-AT tests.

I believe that's what the command does, yes. I don't know if we need it for aria-at, though it might be useful for more general testing of websites or web apps.

> > Move to first status menu in menu bar
>
> Not sure what this refers to. Which menu bar?

I'm not sure. It doesn't seem relevant for testing web content, so I'll remove it from the list.

> > Find next/previous misspelled word
>
> We don't currently have any ARIA-AT tests relying on this, and I'm not sure which screen readers even support it in virtual web content. Definitely doesn't seem like a Milestone 4 command to me.

Indeed, I'll remove it.
Thanks!
For Milestone 4, I think we are missing Navigate to the previous element.
I've edited the milestones in OP to reflect our current thinking. In particular:
- Milestone 1, settings, are now vendor-specific and can include all settings (except any to exclude for security reasons)
- Milestone 4, activate commands, are similarly vendor-specific
- Removed milestones 6 and 7 (previously "more settings" and "more commands")
- Milestones 0 through 3 should represent a good MVP
Based on our conversation in the CG meeting yesterday (minutes), I think we should make the following adjustments to the roadmap:
- Milestone 4: Activate commands
- Milestone 5: Headless mode
- Milestone 6: Internal state
becomes
- Milestone 4: Activate commands - vendor-specific commands, at minimum setting "modes" (as used in aria-at)
- Milestone 5: Internal state - expose vendor-specific state, at minimum getting the current "mode" (as used in aria-at)
- Milestone 6: Headless mode