WICG/scheduling-apis

Specify event loop integration / priority intent

shaseley opened this issue · 7 comments

Forking this from here into its own issue.

Problem: The way scheduler task priorities are integrated into the event loop in the spec is underspecified. The original goal was to allow UAs maximum flexibility to integrate these into their schedulers, but the intent is not clear, which is bad for other implementors and compat.

Task Priorities in Blink

This section describes how Blink implements priority mappings, included for reference for how postTask() fits in. Typically, scheduled tasks run in priority order as follows:

Blink Priority: Task/Source

Highest
  • Input: discrete input, e.g. clicks and typing (NOT the User Interaction task source), and input-blocking tasks (hit test IPC)
  • Rendering: the first frame after discrete input, plus some use cases on Android, e.g. main thread gestures and gesture start (touch start)
Very High
  • Rendering: if we haven't rendered in > 100 ms (rendering starvation prevention). This includes rAF-aligned input.
  • Internal tasks, including find-in-page and IPC forwarding (postMessage)
High
  • 'user-blocking' postTask() tasks
Normal
  • The vast majority of task sources, including networking, timers, posted-message, user interaction, database, etc.
  • 'user-visible' postTask() tasks
  • Rendering default state
Low
  • 'background' postTask() tasks
  • Rendering: during compositor-driven scrolling (under experiment to remove)
Best Effort
  • Idle callbacks (during idle periods), a couple of internal task types

This is the current state, but there are quite a few inactive — and a few active — experiments I left out.

Notes:

  • Rendering is a special case. We treat rendering (including rAF, rAF-aligned input, paint, etc.) as its own task (compositor task queue), and its priority is very dynamic, fluctuating between low and highest.
  • 'user-blocking' can starve other tasks, but not input or rendering
    • Discrete input is always selected first
    • Rendering, including rAF-aligned input, has a starvation prevention mechanism
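
The fluctuating rendering priority described in these notes can be pictured as a pure function of scheduler state. This is only an illustrative model, not Blink's code: the function name, the state fields, and the order in which the conditions are checked are all assumptions. Only the priority levels and the 100 ms threshold come from the list of priorities above.

```javascript
// Hypothetical model of how rendering priority fluctuates; not Blink's actual code.
const STARVATION_THRESHOLD_MS = 100; // from the priority list: "> 100 ms"

// `state` fields are invented for this sketch.
function renderingPriority(state) {
  // First frame after discrete input: rendering runs at the highest priority.
  if (state.firstFrameAfterDiscreteInput) return 'highest';
  // Starvation prevention: boost rendering if no frame was produced recently.
  if (state.msSinceLastRender > STARVATION_THRESHOLD_MS) return 'very-high';
  // During compositor-driven scrolling, main-thread rendering is deprioritized.
  if (state.compositorScrolling) return 'low';
  // Default state: same level as most other task sources.
  return 'normal';
}
```

In this model, 'user-blocking' tasks (high) outrank rendering only while `renderingPriority()` returns 'normal' or 'low', which is one way to read "can starve other tasks, but not input or rendering".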

Specifying Priority

The intention of postTask() priorities:

  1. 'user-blocking' tasks should generally be considered higher priority than other tasks -- especially other task scheduling methods, including setTimeout() and same-window postMessage() (which is used as a scheduling API). But they shouldn't indefinitely block input and rendering.
  2. 'user-visible' tasks should be ~= other ways of scheduling tasks
  3. 'background' should be lower, but can run outside of idle periods
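
For reference, here is a minimal sketch of how a page would express these three intents. `scheduler.postTask(callback, { priority })` is the API under discussion. The wrapper name `postTaskCompat` and its `setTimeout()` fallback are assumptions added so the snippet also runs where `scheduler` is unavailable, and the fallback does not honor priority at all.

```javascript
// Posts a task at the given priority, using scheduler.postTask() when available.
// postTaskCompat is a hypothetical helper; the setTimeout() fallback ignores
// priority and exists only to keep the sketch runnable elsewhere.
function postTaskCompat(callback, priority = 'user-visible') {
  const sched = globalThis.scheduler;
  if (sched && typeof sched.postTask === 'function') {
    return sched.postTask(callback, { priority });
  }
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      try {
        resolve(callback());
      } catch (err) {
        reject(err);
      }
    }, 0);
  });
}

// 1. Intended to run ahead of setTimeout()/postMessage() work, without
//    indefinitely blocking input and rendering.
postTaskCompat(() => 'update UI after click', 'user-blocking');
// 2. Intended to be roughly equal to other ways of scheduling tasks.
postTaskCompat(() => 'render below-the-fold content', 'user-visible');
// 3. Intended to be lower priority, but not restricted to idle periods.
postTaskCompat(() => 'send analytics', 'background');
```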

A challenge is that UAs are currently free to choose rendering opportunities and schedule between task types as they see fit, and I don't want to create a total ordering of tasks and stifle UA experimentation. But we need to make clear the intention and consider providing stronger guarantees.

This is also a challenge for scheduler.yield(), where we want stronger guarantees (see this section). I wrote there about maybe using a denylist approach, where the relative ordering vs. some task sources is specified, providing some guarantees. Doing that for both APIs would be simpler/cleaner, and maybe a good place to start? And it could potentially be augmented with notes about intent and guidance.
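
One way to picture that approach: the spec would list only the pairs whose relative order is guaranteed and leave everything else to the UA. The sketch below is purely illustrative. The specific pairs, the source names, and the function name are assumptions chosen to mirror the stated intent, not proposed spec text.

```javascript
// Hypothetical guarantees as [runsFirst, runsAfter] pairs. Anything not listed
// is left to the UA. These pairs are illustrative assumptions, not spec text.
const GUARANTEED_ORDER = [
  ['user-blocking', 'timer'],
  ['user-blocking', 'posted-message'],
  ['timer', 'background'],
  ['posted-message', 'background'],
];

// `uaLevels` maps a task source to a numeric level (lower runs first).
// Equal levels are allowed, so a UA is not forced into a total ordering.
function satisfiesGuarantees(uaLevels) {
  return GUARANTEED_ORDER.every(([first, second]) => {
    // Sources the UA does not list are unconstrained in this sketch.
    if (!(first in uaLevels) || !(second in uaLevels)) return true;
    return uaLevels[first] < uaLevels[second];
  });
}

// A Blink-like assignment: 'user-visible' shares a level with timers.
const blinkLike = {
  'user-blocking': 1,
  'user-visible': 2,
  timer: 2,
  'posted-message': 2,
  background: 3,
};
console.log(satisfiesGuarantees(blinkLike)); // true
```

A UA remains free to place rendering, input, or any unlisted source wherever it likes, which keeps room for experimentation while still making the listed guarantees checkable.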

FYI @sefeng211 and @smaug----. Working on this, I'll try to get a draft of a change sometime soon. Thanks for pointing it out.

Thanks for the summary!

I just realized some of those priorities in Chrome don't quite match Gecko. Especially the "Very High rendering" vs. "Normal rendering", so it is a bit unclear when we'd run 'user-blocking'.
But I take that the basic idea is that user-blocking would normally run before 'rendering update', right?

Also, do different tasks in normal priority then have some "subpriority", or do they get processed in the order they were created? Can a 'user-visible' task be handled before a pending 'Rendering default state' (I assume that means the HTML spec's rendering update)?

And sorry about being so late with this. I had lost the email about this issue but sefeng reminded me about this recently.

> I just realized some of those priorities in Chrome don't quite match Gecko. Especially the "Very High rendering" vs. "Normal rendering", so it is a bit unclear when we'd run 'user-blocking'. But I take that the basic idea is that user-blocking would normally run before 'rendering update', right?

For some additional context, in Chrome the "update the rendering" steps happen in a separate task internally, which used to be scheduled at "normal priority". I think we used normal priority (vs. on every vsync) because rendering too much added latency to loading, for example. But that could result in too much lag in updating the UI, so we periodically boost the priority of these tasks to strike a balance. IIUC Gecko has a different approach (decreasing frame rate during loading?).

We also didn't want user-blocking tasks to indefinitely starve rendering updates, and periodically boosting the priority of rendering ensures that. The thought is if user-blocking tasks are updating the DOM, we don't want to starve showing that to the user. Also, if the browser decides to process input while user-blocking tasks are pending (which Chrome does), we'd want to update the UI asap.

So the idea is that "user-blocking" tasks are a type of HTML task (and run in that part of the event loop steps). The priority is meant to help choose between other HTML tasks (typically favored, but shouldn't block critical tasks like input), and how rendering opportunities are chosen wouldn't change.

> Also, do different tasks in normal priority then have some "subpriority", or do they get processed in the order they were created?

Yeah, 'Rendering default state' means the "update the rendering" steps -- which happen in a separate task in Chrome. The scheduler will run tasks in priority order, and in posting/creation order within that priority -- so no subpriorities within normal. You could probably think about "user-blocking" as a subpriority above normal, and "background" as a subpriority below normal.
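
The "priority order, then posting order" rule can be modeled in a few lines. This is a toy model for discussion, not Chrome's scheduler: the class name, the numeric levels, and the task names are assumptions.

```javascript
// Illustrative levels: lower number runs first. 'user-visible' shares a level
// with ordinary task sources ('normal'), per the description above.
const LEVEL = { 'user-blocking': 0, normal: 1, 'user-visible': 1, background: 2 };

class TaskQueueModel {
  constructor() {
    this.tasks = [];
    this.nextSeq = 0; // monotonically increasing posting order
  }

  post(name, priority) {
    this.tasks.push({ name, level: LEVEL[priority], seq: this.nextSeq++ });
  }

  // Returns task names in the order this model would run them:
  // by priority level first, then by posting order within a level.
  runOrder() {
    return [...this.tasks]
      .sort((a, b) => a.level - b.level || a.seq - b.seq)
      .map((task) => task.name);
  }
}

const queue = new TaskQueueModel();
queue.post('timer', 'normal');
queue.post('analytics', 'background');
queue.post('visible-work', 'user-visible');
queue.post('click-response', 'user-blocking');
console.log(queue.runOrder());
// [ 'click-response', 'timer', 'visible-work', 'analytics' ]
```

Note that 'timer' runs before 'visible-work' only because it was posted first. Within the shared level there is no further preference.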

One other note: I know the HTML spec allows different task sources to be prioritized separately -- which could be implemented using subpriorities within normal priority, but in Chrome we mostly don't (input is the main exception, and recently we experimented with boosting priority of image loading tasks on the main thread, which helped improve CLS during loading). Partly we don't know what should be prioritized without developer hints, and extending the postTask priorities to other APIs (within reason) could be helpful (this was the last bit in my TPAC talk).

> Can a 'user-visible' task be handled before a pending 'Rendering default state' (I assume that means the HTML spec's rendering update)?

In Chrome, yes -- it depends on the relative order they were scheduled. This is also true for other task sources, e.g. MessageChannel tasks, which are sometimes used for scheduling. As mentioned, in the past this could lead to rendering getting starved, so we periodically boost the priority. Does Gecko run rendering after every task?

> And sorry about being so late with this. I had lost the email about this issue but sefeng reminded me about this recently.

No worries! Thanks for the questions. LMK if discussing this further in a WHATWG triage call/WebPerfWG session would make sense.

Gecko deprioritizes animation frames during loading, so in practice that ends up reducing frame rate often.

In Gecko "update the rendering steps" is also a separate task, and it happens to be rather high priority. Input events are higher priority - or rather aligned with the task used for "update the rendering steps". So input events are usually dispatched right before rAF. Doesn't Blink have rAF-aligned input events too (except perhaps pointerrawupdate)?

Gecko tries to avoid starving rAFs and normal priority tasks. So if processing rAF takes (as an example on a common monitor) > 16 ms, we do let normal priority tasks be handled. And if normal priority tasks take lots of time, we do ensure rAF gets processed every now and then (our limit there, 4 * frame rate, isn't that different from Blink's 100 ms).
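
As a rough illustration of that two-way starvation avoidance, here is a small decision function. It is a hypothetical model, not Gecko's implementation: the names and state fields are invented, a 60 Hz display is assumed, and "4 * frame rate" is read as four frame intervals (about 67 ms), which is an interpretation rather than something stated above.

```javascript
// Hypothetical model of two-way starvation avoidance; not Gecko's actual code.
// Assumptions: 60 Hz display; "4 * frame rate" read as four frame intervals.
const FRAME_MS = 1000 / 60;
const MAX_RAF_DELAY_MS = 4 * FRAME_MS;

function pickNext({ rafPending, normalPending, lastRafDurationMs, msSinceLastRaf }) {
  if (!rafPending) return normalPending ? 'normal' : 'idle';
  if (!normalPending) return 'raf';
  // Normal tasks have held the thread too long: make sure rAF gets a turn.
  if (msSinceLastRaf > MAX_RAF_DELAY_MS) return 'raf';
  // The previous rAF overran its frame budget: let normal tasks in.
  if (lastRafDurationMs > FRAME_MS) return 'normal';
  // Otherwise rendering is the higher-priority task.
  return 'raf';
}
```

With both kinds of work pending, a 20 ms rAF lets a normal task run next, while 80 ms without a rAF forces rendering regardless.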

But ok, maybe in Gecko 'user-blocking' tasks could be processed between input events and rAF (but perhaps rAF needs to get through every now and then, if 'user-blocking' tasks are slow).
And 'user-visible' could be lower priority, lower than rAF, so possibly normal priority or mediumhigh, which we have between normal and vsync (rAF).
And 'background' would be Gecko's 'low'.

Scheduling is tricky.

(Looks like I use "update the rendering steps" and rAF to mean basically the same thing)

So I think in the spec we should define at least a bit more precisely how user-blocking and user-visible are expected to work vs. the rendering update. We probably still want to let UAs tweak their scheduling quite a bit, if needed, but the high-level expectations should be documented.