WICG/scheduling-apis

Consider extending the behavior of delay to support additional delay types

calinoracation opened this issue · 16 comments

While implementing early experiments using the postTask scheduler at Airbnb, we took some time to change our early initializers that load things like tracking, google maps, recaptcha and such to use the postTask scheduler. One thing that was incredibly helpful for us in this was extending delay to support waiting until a DOM event, such as load or DOMContentLoaded, we even allowed for a custom event that we fire post React hydration called hydrate-complete. The flexibility and power this gave us helped us to overwhelmingly maximize initial loading performance by deferring non-critical events until after our UI becomes interactive.

// Notice the usage of load to the delay argument instead of a number
postTask(initializeServiceWorker, { priority: 'background', delay: 'load' }); 

// here we wait for our custom hydrate-complete, waiting to do things like show
// a flash banner until just after we hydrate React
postTask(() => Flash.display(), { delay: 'hydrate-complete' });

While the hydrate-complete custom option might be questionable, the extension to allow waiting for standard DOM events seems genuinely useful, especially around 3rd party script loading and deferring non-critical tasks until later. We've seen issues where some of our 3rd party scripts do unfortunate things, like cookie checks or things that might case a repaint or reflow or do some really heavy computation and lead to unnecessary blocking, having this mechanism allowed us to improve both our total blocking time and first input delay considerably, and would love to get folks thoughts on if they think this would be a valuable addition to the spec.

I like this idea, probably as a follow-up to the MVP. Did this completely replace using numeric delay for you, or did you still have use cases for that? Also, did you ever combine the two?

I've never combined the 2 as far as needing the delay + numeric, and at times we still use the numeric, for example in the case of scrolling our listing cards, if you are in the viewport of a listing for 1 second, then we preload the sections API response for the detail view, if you stay another second, we download the javascript needed for the listing detail and potentially the hero image to improve the speed at which we can respond to a user clicking on the listing and getting the details view.

I've also implemented a custom hook in React to hook into these custom delays, the most prominent of which is used for things like delaying low priority components from rendering or interfering with the critical path until after a user is able to fully interact with the application. I use it like this:

const [renderAllComponents] = usePostTask({ delay: 'all-settled', priority: 'background' });

return (
  <>
    {renderAllComponents && (
      <>
        <PerformanceMonitor />
        <Prefetcher routes={routes} />
        <ServerEnvironmentMessage />
        <CookieBanner />
      </>
    )}

    <BottomTabBar />
  </>
);

Back to your original question of if I ever combined them, I suppose in a way maybe I am, but I was easily able to workaround it with code like this:

const controller = new TaskController('background');

// Wait for the document to finish loading. If I needed to combine these 
// into a single task I'd need the numeric and custom delay, but I instead 
// blocked on this delay then the numeric timeout.
await postTask(() => {}, { delay: 'load' });

const preloadedRoute = await postTask(
  () => preloadRoute<Sections>({ href }),
  { delay: 1000, signal: controller?.signal },
);

await postTask(() => preloadPrepare(preloadedRoute), { signal: controller?.signal });
await postTask(() => preloadComponent(preloadedRoute), { signal: controller?.signal });

@calinoracation That's great info, thanks!

That last example also illustrates something we've been thinking about regarding alternative API shapes for delay — that sometimes the current task wants to wait for something and resume, kind of like a conditional yield. In that case, it would be cleaner to be able to just wait rather than having to post an empty task. For example:

async function myTask(task) {
  ...
  // Wait for a specific event and then resume.
  await task.wait('load');

  // Wait for some amount of time and then resume.
  await task.wait(1000);
}

In that example, I'm using an explicit task context object that gets passed to postTask (by the browser). We're still working through the trade-offs here between that and something global, but the idea is similar. I think there's probably a case for both options — a postTask delay parameter and an API like this. The delay parameter is still cleaner in your examples where you just want to queue some delayed work and proceed.

One follow-up question: in your implementation of custom delays, what happens if the event already fired, e.g. if load already occurred but you queue a task that waits for load?

I'm curious in my example where I want to wait until load + a 1s delay to execute preloadRoute, how would that look in your example? I think I get the global option of something like scheduler.postTask.wait('load' || 1000), but how would that work with a specific task like preloadRoute where it's a single call potentially not behind any delay?

In my implementation of the custom delays, I have a map of the available queue delay types as the key and a Promise as the value, so it's totally fine to wait on an already fired event as the promise will be in the fulfilled state.

const eventToPromiseMap = new Map<QueueDelayType, Promise<unknown>>();

and then when getting the delay:

eventToPromiseMap.get(delayType)

I was just wondering, would we be able to invoke that globally or pass options, like postTask.wait('load', { priority: 'background', signal: aSignal }?

Sorry, I might have made this confusing by adding a separate API idea in the mix. I think there are two problems here:

  1. How do we schedule a task in the future with a delay — either event-based and/or time-based, and
  2. How do we make the current task wait

My opinion is it makes sense to have different options for both (1) and (2).

For (1), we have the postTask option for time-based delay already, and the question is whether or not we should add a separate option for event-based delay, or reuse delay. I could imagine something like this that would allow you to combine them:

// Run someFutureTask 1 second after load.
scheduler.postTask(someFutureTask, { event: 'load', delay: 1000 });

task.wait() (or scheduler.wait()) is for more of an idea for solving (2).

I was just wondering, would we be able to invoke that globally or pass options, like postTask.wait('load', { priority: 'background', signal: aSignal }?

A major design decision is whether or not the API should be global or tied to postTask in some way (e.g. as a method on an object passed to postTask). This matters for determining the task context, which includes the task's priority and signal, and is needed especially to determine how to resume. If it's tied to the current task (passed to postTask), then a wait function would already have that information because it would be tied to the object. If there's a global function, then we'd need to either infer the context (challenging) or have it passed by the developer for each invocation.

The tied-to-the-current-task version would look something like this:

// Note the argument to postTask.
async function myTask(task) {
  initializeStuff();

  // Wait until load. This will resume at whatever |task|'s priority is.
  // We could also tie this to |task.signal| so the task aborts if the signal
  // aborts, but we'd need to make sure that's the desired behavior.
  console.log(task.priority);
  await task.wait('load');

  // Similarly, this will resume at |task|'s priority.
  await task.wait(1000);

  carryOn();
}

const controller = new TaskController('background');
const signal = controller.signal;
scheduler.postTask(myTask, { signal });

If we made it a global function, rather than tying it to a task, then it would look something like this:

async function myTask() {
  initializeStuff();

  // Wait until load. For resuming, either the priority needs to be determined by the
  // browser (i.e. by the browser propagating it), or it needs to be passed.
  await scheduler.wait('load');

  // Same here.
  await scheduler.wait(1000);

  carryOn();
}

const controller = new TaskController('background');
const signal = controller.signal;
scheduler.postTask(myTask, {signal});

I'm curious in my example where I want to wait until load + a 1s delay to execute preloadRoute, how would that look in your example? I think I get the global option of something like scheduler.postTask.wait('load' || 1000), but how would that work with a specific task like preloadRoute where it's a single call potentially not behind any delay?

This depends on whether or not we include the separate postTask option. If we do, then you could do:

const preloadedRoute = await postTask(
  () => preloadRoute<Sections>({ href }),
  { delay: 1000, event: 'load', signal: controller?.signal },
);

If delay was either an event or numeric value (but not both), and we added the wait() function, then it might look something like this:

const preloadedRoute = await postTask(
  async (task) => {
    await task.wait(1000);
    preloadRoute<Sections>({ href });
  },
  {delay: 'load', signal: controller?.signal },
);

IMO having both options in postTask and in wait() (assuming we add it) seem reasonable. I think my proposal would be:

  1. Add a separate postTask option for event-based delay. This would be a nice, narrowly-scoped extension of postTask.
  2. Separately, work on a proposal for scheduler.wait() that allows time- and event-based delays. The design choices (global vs. task-based) have a lot in common with a yield() API, so we're looking at kind of co-designing this and writing up a new explainer.

That's incredibly helpful, thanks for explaining it in that way. I think in my mind the proposal for option 1 to add a separate option for event would work really well. Are you thinking the same semantics as far as if the event ever fired, it would keep track and allow the task to be scheduled?

Are you thinking the same semantics as far as if the event ever fired, it would keep track and allow the task to be scheduled?

That matches my intuition for one-off events, but it's not clear what should happen for events that can fire multiple times. A couple cases come to mind that might be relevant:

  1. Are there custom events like single page app navigations that can fire multiple times?
  2. document.write() can cause the load event to fire more than once (thanks @natechapin), so what should happen there?

I think the question is whether or not there are use cases for events that might fire more than once and that one would want to tie the task to the next occurrence. In (2), my intuition is that we'd want to clear the scheduler's state about the load event when we're starting a new load, and I think the same applies to (1) as well (assuming a non-unique event name).

I could see providing a mechanism for resetting the scheduler's tracking of a specific event. For internal events like load, that should be straightforward enough. For custom events, we'd need a way to communicate that to the scheduler. Alternatively the API could take an option for defining the behavior (i.e. an option that says "wait for the next event" vs. "wait if the event didn't already occur).

There are a few other related questions/concerns that I'll brain-dump into another comment soon and try to work towards a first-pass proposal.

Most of the events we listen to are all one-offs, but I can definitely see in a SPA needing to 'reset' these and that might be super useful. For example, things like delay: 'all-settled' could be re-used in waiting for the next page once the current one fires to let us do the exact same thing, like delaying mounting low priority components / paths / prefetching until the main image and simulated TTFMP occurs.

Having event as part of the API, would partly replace a scheduling implementation I wrote recently (see here for reference). What's missing for me, is the need for a timeout rather than a delay to prevent the starvation of resources. For example, consider a custom event like so:

postTask(myTask, { event: 'myCustomEvent' });

If myCustomEvent never fires for whatever reason, I would still like to guarantee myTask runs after some time. I believe introducing a timeout option like in requestIdleCallback would solve this problem, changing the API to this:

postTask(myTask, { event: 'myCustomEvent', timeout: 10000 });

Has something like this already been considered?

Additionally, I noticed in postTask: Enabling App-Specific Priorities, that user-defined priorities will possibly be supported through a rank option. For my use-case, I would like to schedule a task that only runs when all of the tasks in the previous priority (or rank) are finished. Would this be handled by rank itself, or would consumers need to solve this problem, potentially with custom events or hooks provided by the scheduler? Here is a more concrete example of what I am trying to achieve:

function createTask(id, duration) {
  return () => {
    console.log(`running task ${id}`);
    return new Promise(resolve => setTimeout(() => {
      console.log(`completed task ${id}`);
      resolve();
    }, duration));
  };
};

const customAppPriority = 1;
const task1 = scheduler.postTask(createTask('1', 3000), { rank: customAppPriority });
const task2 = scheduler.postTask(createTask('2', 3000), { rank: customAppPriority });

// task3 should only run when the tasks in the previous priority have completed (i.e. task1, and task2)
// or after the expected timeout
const task3 = scheduler.postTask(createTask('3', 1000), {
  // would this be needed, or could rank handle this automatically?
  event: 'customAppPriority:empty',
  rank: customAppPriority + 1,
  timeout: 5000,
});

// expected logs:
//
// running task 1
// running task 2
// completed task 1
// completed task 2
// running task 3
// completed task 3

I would love to hear your thoughts on how something like this could be achieved given the current direction of the API :)

Having event as part of the API, would partly replace a scheduling implementation I wrote recently (see here for reference). What's missing for me, is the need for a timeout rather than a delay to prevent the starvation of resources. For example, consider a custom event like so:

postTask(myTask, { event: 'myCustomEvent' });

If myCustomEvent never fires for whatever reason, I would still like to guarantee myTask runs after some time. I believe introducing a timeout option like in requestIdleCallback would solve this problem, changing the API to this:

postTask(myTask, { event: 'myCustomEvent', timeout: 10000 });

Has something like this already been considered?

Yeah this is a great point about the event-based delay. It did occur to me as something we'll need to figure out for this, but that's as far as I got. I think we'll need to consider something like this here for sure.

Would you want to use the timeout for tasks in general, or just those tied to events?

We have thought about adding a timeout value for starvation prevention (mentioned briefly in this part of the explainer), but we're holding off to see if this will actually be needed or not. We have some concerns about degraded performance caused by "timeout storms", e.g. lots of small timeout values timing out or a long task causing a bunch of timeouts to expire.

Depending if we want to apply the timeout only to events or tasks in general, we might need to do {event: {name: 'myCustomEvent', timeout: 1000}} or something like that (maybe accepting a string as shorthand). I think the difference would be when it's tied to an event, the timeout should post the task at the priority it was scheduled at; if not, it would need to increase the priority.

Additionally, I noticed in postTask: Enabling App-Specific Priorities, that user-defined priorities will possibly be supported through a rank option. For my use-case, I would like to schedule a task that only runs when all of the tasks in the previous priority (or rank) are finished. Would this be handled by rank itself, or would consumers need to solve this problem, potentially with custom events or hooks provided by the scheduler? Here is a more concrete example of what I am trying to achieve:

function createTask(id, duration) {
  return () => {
    console.log(`running task ${id}`);
    return new Promise(resolve => setTimeout(() => {
      console.log(`completed task ${id}`);
      resolve();
    }, duration));
  };
};

const customAppPriority = 1;
const task1 = scheduler.postTask(createTask('1', 3000), { rank: customAppPriority });
const task2 = scheduler.postTask(createTask('2', 3000), { rank: customAppPriority });

// task3 should only run when the tasks in the previous priority have completed (i.e. task1, and task2)
// or after the expected timeout
const task3 = scheduler.postTask(createTask('3', 1000), {
// would this be needed, or could rank handle this automatically?
event: 'customAppPriority:empty',
rank: customAppPriority + 1,
timeout: 5000,
});

// expected logs:
//
// running task 1
// running task 2
// completed task 1
// completed task 2
// running task 3
// completed task 3


I would love to hear your thoughts on how something like this could be achieved given the current direction of the API :)

Sure! Interesting. We've discussed the idea of an "async task queue" or "job queue" (bullet point three here). The basic idea is that each task is treated as a yieldy asynchronous task (see this WIP PR), and the next task doesn't start until the previous one ends (signaled by the promise resolving).

If we were to apply priority and rank to these "async task queues" or whatever abstraction, then I think they naturally have the properties you're looking for. But I don't think that would be the default for either postTask() or postTask() + rank.

Timeout definitely comes into play here as well since there is no guarantee the task's promise will be resolved! That's the part that I'm a bit worried about here, and something we'll have to think through a bit more.

Can you say more about your use cases? I do agree that this would be generally useful, but I'd be interested to learn more about a concrete example of where something like this would be used.

Would you want to use the timeout for tasks in general, or just those tied to events?

In general would be preferred, as I wouldn't want a lower priority (but still important) task to be starved, if higher priority tasks are continuously added to the queue.

the next task doesn't start until the previous one ends (signaled by the promise resolving).

If I'm understanding this correctly, it might not map well to the library API I'm working with. Specifically, we have two main priorities (or phases) right now which I will refer to as Phase A and Phase B. They are used to code-split React components, using a priority that indicates when the bundle should load, and thus render the component. It should later be extended to handle background tasks, that may or may not need a React render context. The usage would look like this for a simple page layout, where we want to prioritise loading the header and content, while delaying the sidebar rendering:

// The dynamic import here maps to a "task"
const Header = phaseA(() => import('./header');
const Content = phaseA(() => import('./content');
const Sidebar = phaseB(() => import('./sidebar');

export const PageLayout = () => (
  <>
    <header>
      <Suspense fallback="Loading header...">
        <Header />
      </Suspense>
    </header>
    <nav>
      <Suspense fallback="Loading sidebar...">
        <Sidebar />
      </Suspense>
    </nav>
    <main>
      <Suspense fallback="Loading content...">
        <Content />
      </Suspense>
    </main>
  </>
);

Currently Phase B is controlled, meaning that you must invoke an API to start loading this phase (B). This is usually called when all of the main content has rendered. I am trying to move away from the controlled model, and towards a more hybrid approach, where the phases are more automated. This will fit more of our product needs, and results in two types of behaviour:

  1. Fully automated
    • Phase A starts as soon as possible
    • Phase B will start when either:
      • there are no Phase A tasks left in the queue
      • the timeout occurs¹
  2. Controlled, with subsequent automation
    • Phase A starts as soon as possible
    • Phase B will start when either:
      • the controlled task yields, OR when there are no Phase A tasks left in the queue² (depending on optimal / possible approaches)
      • the timeout occurs
    • After the initial control, the fully automated approach should take place until the next page transition³

So then how would you schedule a task for Phase B, particularly with the automated approach, using the native scheduling API? My general needs and concerns here are:

  • I don't want to start loading a Phase B component, when the Phase A tasks at the time of rendering a Phase B component have completed, given that more Phase A tasks could come through in the same render pass. For example, given the page layout example above, I want to guarantee the sidebar loads last, even though it renders second, between two Phase A components.
  • Needing to manage some "empty" event is not ideal, as all of the Phase A tasks would need to be tracked. Otherwise there is nothing to really yield on, without adding a ton of complexity?

¹ The timeout might not be needed in this case, I haven't tested this in production yet to see the exact effects
² This could be a specific empty event, or inferred by the scheduler. For the latter, a deferred promise can be scheduled in Phase A, that resolves when the next phase is ready to start. The idea being that Phase B would start once this resolves, provided the scheduler waits for Phase A tasks to complete first.
³ It's quite possible we won't need this behaviour once we move towards entrypoints with Relay, and make everything available before we need it via better preloading

We've recently implemented per-page 'events' now, specifically for Time to our unique meaningful paint, but we have some pages that don't fire it (a bug but it happens). Our naive implementation defines the default timeout when we listen to the events like below, and we've found it to save us in a number of instances like @MonicaOlejniczak mentioned where the event otherwise wouldn't of fired.

{
      delay: 'TTFMP',
      type: 'per-page',
      options: {
        timeout: 5000,
      },
    },

For these the struggle is knowing when to 'reset' our per-page events. We have a performance recorder that I was able to hook into reasonably to do this, but it seems brittle. We currently call something like resetPageEvents when the page is changing and we cancel any outstanding listeners and wait for new ones.

I think something more granular might be good, as for one-time events like DOMContentLoaded, we'd never need to worry about that happening again.

We also ran against the issue of how early can we possibly listen for these events, which the native version wouldn't have an issue with.

I have a new idea of yield(), to resolve in in the language itself. That also enables the userland scheduler.

https://es.discourse.group/t/cooperative-async-function/975