nodejs/node

Tracking Issue: Syncify the ESM Loader

GeoffreyBooth opened this issue · 10 comments

The code under lib/internal/modules/esm, a.k.a. the ESM loader, contains many functions that are async. We should refactor as many of these as possible, ideally all of them, to be synchronous. This should improve the performance of evaluating ESM code, bringing it roughly on par with the speed of running CommonJS code.

Longer term, once the ESM loader is synchronous and we land the synchronous module customization hooks, we could deprecate monkey-patching the CommonJS loader and merge together the CommonJS and ESM loaders, eliminating duplication: #50356.

This issue will track our progress syncifying the various files and functions of the ESM loader until we can get as much of it to be as synchronous as possible.

The files to be updated, all underlib/internal/modules:

  • run_main.js: asyncRunEntryPointWithESMLoader
  • esm/fetch_module.js: fetchWithRedirects
  • esm/fetch_module.js: isLocalAddress
  • esm/hooks.js: Hooks class (the async methods here probably don’t need updating as they will be removed once we migrate to the synchronous customization hooks)
  • esm/hooks.js: nextHookFactory
  • esm/load.js: getSource #57419
  • esm/load.js: defaultLoad #57419
  • esm/loader.js: ModuleLoader.eval
  • esm/loader.js: ModuleLoader.getModuleJobForImport
  • esm/loader.js: ModuleLoader.loadAndTranslate
  • esm/loader.js: ModuleLoader.import
  • esm/loader.js: ModuleLoader.load
  • esm/module_job.js: ModuleJob._link
  • esm/module_job.js: ModuleJob._instantiate
  • esm/module_job.js: ModuleJob.run
  • esm/module_job.js: ModuleJobSync.run
  • esm/translators.js: wasm handler, via translators.set('wasm', ...
  • esm/utils.js: importModuleDynamicallyCallback
  • esm/utils.js: initializeHooks (might not need updating as we will remove this once the synchronous customization hooks land
  • esm/worker.js: customizedModuleWorker (might not need updating as we will remove this once the synchronous customization hooks land
  • esm/worker.js: handleMessage (might not need updating as we will remove this once the synchronous customization hooks land

@nodejs/loaders @mcollina @JakobJingleheimer @joyeecheung

Maybe moving it in the loaders repo would be more appropriate since its a effort that is gonna require different iterations?

This task is better handled by a large group of people, and this repo is more visible. Once the first few steps are done, it can easily be tagged as "good first issue"

I would like to mention my issue here.
Please check if it's worth fixing.

0hmX commented

Hi @mcollina and @marco-ippolito,
Is anyone currently working on any of these issues? I’d love to help resolve a few. Could you recommend a good starting point? I’d like to begin with something low-risk—what would be the best issue to pick?

Also what is the best method to test if my code is improving performance?

This may be necessary to prevent a race condition like prettier/prettier#17139 (comment)

@cu8code Sorry to not notice your comment earlier. If you'd like to help with this effort, you can pick any of the functions listed in the top issue and try to refactor them to be synchronous. The lowest hanging fruit are the ones that are async functions but never actually await anything (perhaps because they used to contain async logic but no longer do, for example). Besides prioritizing the easiest parts, I would also prioritize the hottest code paths: the functions involved in initial load of the most common use case like node app.js where app.js is an ES module. I would make the PRs as small as possible so that any resulting issues can be isolated as narrowly as possible, along the lines of #57390.

  • esm/fetch_module.js: fetchWithRedirects

  • esm/fetch_module.js: isLocalAddress

Can be removed since we removed http imports

re: esm/load.js getSource

is there any reason the calling functions can't call getSourceSync (or even the whole function replaced with getSourceSync everywhere it's called?

/**
 * @param {URL} url URL to the module
 * @param {ESModuleContext} context used to decorate error messages
 * @returns {Promise<{ responseURL: string, source: string | BufferView }>}
 */
 async function getSource(url, context) {
  const { protocol, href } = url;
  const responseURL = href;
  let source;
  if (protocol === 'file:') {
    const { readFile: readFileAsync } = require('internal/fs/promises').exports;
    source = await readFileAsync(url);
  } else if (protocol === 'data:') {
    const result = dataURLProcessor(url);
    if (result === 'failure') {
      throw new ERR_INVALID_URL(responseURL, null);
    }
    source = BufferFrom(result.body);
  } else {
    const supportedSchemes = ['file', 'data'];
    throw new ERR_UNSUPPORTED_ESM_URL_SCHEME(url, supportedSchemes);
  }
  return { __proto__: null, responseURL, source };
}

 /**
 * @param {URL} url URL to the module
 * @param {ESModuleContext} context used to decorate error messages
 * @returns {{ responseURL: string, source: string | BufferView }}
 */
 function getSourceSync(url, context) {
  const { protocol, href } = url;
  const responseURL = href;
  let source;
  if (protocol === 'file:') {
    source = readFileSync(url);
  } else if (protocol === 'data:') {
    const result = dataURLProcessor(url);
    if (result === 'failure') {
      throw new ERR_INVALID_URL(responseURL);
    }
    source = BufferFrom(result.body);
  } else {
    const supportedSchemes = ['file', 'data'];
    throw new ERR_UNSUPPORTED_ESM_URL_SCHEME(url, supportedSchemes);
  }
  return { __proto__: null, responseURL, source };
}

@ntindle no, there isn't 👍 getSource and getSourceSync were deduplicated in the PR(s) that syncify defaultLoad. I'm gonna collapse that comment and this as resolved in the interest of keeping this issue easy to grok.

#57749 adds everysync as decided in the collab summit.