nodejs/help

Require cache equivalent in es modules

tcodes0 opened this issue · 34 comments

  • Node.js Version: 14.4
  • OS: macos 10.14.6
  • Scope (install, code, runtime, meta, other?): modules
  • Module (and version) (if relevant):

See this SO question. https://stackoverflow.com/questions/51498129/require-cache-equivalent-in-es-modules

My use case: With cjs we can use the require object and its props to invalidate a module's cache so a new require loads it fresh. Can't seem to find a way to do the same with esm. Context: repl code reloading applications.

related dwyl/decache#51

The esm cache is not exposed, and even if it were, it is not mutable.

@Thomazella this is clearly a missing piece of the integration, so it's worth thinking about and there are a few ways these use cases can be approached.

The simplest today would be to create a new VM Module (@devsnek's work!) environment for each unique context. The issue here is clearing the registry means clearing the entire context. Realms might touch on this in a similar way in future.

More fine-grained reloading of modules could certainly be supported by permitting registry-level deletion from the cache. One could imagine a require('module').deleteLoaderModule(id) or similar that purges individual modules from the cache so they are forced to be loaded fresh next time. This is also what the SystemJS loader permits via System.delete.

Node.js is in a different position to the browser on this problem because it has the driving need first. The difficulty is that Node.js is so used to following the browser and other specs that it's hard to pick up the baton on this one and lead the spec and integration work, which is what we really should be doing for this problem.

If someone were interested in a PR for deleteModule or similar as described above I would personally be all for merging that. There are complexities and wider concerns that come up with this work but it does just need someone to pick it up and work through the considerations, including possibly engaging at a v8 / spec level.

I would be against such an API being added. Either the use case is hot reloading and should be brought up with V8 or the code should be rewritten to not need to be reloaded.

@devsnek this is not a controversial feature, it is just a tough one! Smalltalk had this as a basic feature, it is hardly novel or concerning.

Bringing up hot-reloading with v8 sounds like a very fruitful direction to me. One could even imagine a hot-reloading API that pushes out the live binding updates (like Smalltalk offered too).

v8:10476 has info about the removal of live edit and may be a good place to express a want of hot reloading.

To be clear I personally do not have resources to drive this but am glad to engage where I can.

@devsnek could you please point me to some issues, discussion or code to understand better the way forward here? I'm mainly missing why exposing the cache would be bad and why v8 seems the best alternative.

Can't seem to find the v8 issue mentioned either. Is it hosted somewhere other than v8/v8?

Is there any undocumented way to get the ESMLoader instance? I am looking for ways to get a list of loaded modules.

The answer is yes, check out https://stackoverflow.com/questions/63054165/is-there-any-way-to-access-an-internal-node-js-module

rlnt commented

I am still searching for a good way to do this.
What I am doing right now is cache-busting which causes a memory leak but it's better than rewriting everything.
I rely on the "type": "module" option.
https://ar.al/2021/02/22/cache-busting-in-node.js-dynamic-esm-imports/

to be clear to anyone who wanders across this issue: there is no implementation node.js could provide that would not leak memory. if you wish for such a thing, please direct that to V8.

is there somewhere a V8 issue already? 🤔

as a potential alternative, depending on the use case, worker threads could be used. if a module is hot [re]loaded inside a worker thread, then, if handled correctly, there will be no memory leak. to reload, the worker thread has to be terminated.
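
A minimal sketch of the worker-thread idea, under some assumptions: the helper name `runInFreshWorker` and its `exportName` parameter are made up for illustration, and only structured-cloneable results (strings, plain objects) can cross the thread boundary, so the module's functions themselves cannot be handed back to the main thread. Each call starts a new worker with its own module cache, calls one named export, and terminates the worker afterwards.

```javascript
const { Worker } = require("node:worker_threads");
const { pathToFileURL } = require("node:url");

// Code executed inside each worker: import the module, call one named
// export, and post the (structured-cloneable) result back to the parent.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  import(workerData.moduleUrl)
    .then((mod) => mod[workerData.exportName]())
    .then((result) => parentPort.postMessage({ result }))
    .catch((error) => parentPort.postMessage({ error: String(error) }));
`;

// Hypothetical helper: run `exportName` from `modulePath` in a brand-new
// worker, so the module is always loaded fresh from disk.
function runInFreshWorker(modulePath, exportName) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { moduleUrl: pathToFileURL(modulePath).href, exportName },
    });
    worker.once("message", ({ result, error }) => {
      // Terminating the worker discards its module cache along with it.
      worker.terminate();
      if (error) reject(new Error(error));
      else resolve(result);
    });
    worker.once("error", reject);
  });
}
```

The trade-off is start-up cost: every reload pays for a new thread and a full re-import of the module's dependency graph.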

This use case I have been supporting in my project, wherein users can specify a named export at an agreed upon path location (think file based routing for a static site / SSR generator) by which to return some HTML or other supported browser destined code.

// /src/routes/artists.js
const fetch = require('node-fetch');

async function getContent() {
  const artists = await fetch('http://..../api/artists').then(resp => resp.json());
  const artistsListItems = artists.map((artist) => {
    const { id, name, bio, imageUrl } = artist;

    return `
      <tr>
        <td>${id}</td>
        <td>${name}</td>
        <td>${bio}</td>
        <td><img src="${imageUrl}"/></td>
      </tr>
    `;
  });

  return `
    <h1>Hello from the server rendered artists page! 👋</h1>
    <table>
      <tr>
        <th>ID</th>
        <th>Name</th>
        <th>Description</th>
        <th>Image</th>
      </tr>
      ${artistsListItems.join('')}
    </table>
  `;
}

module.exports = {
  getContent
}; 

The issue here is that for development, where we provide file watching + a live reload server (so not HMR), the developer can be making changes to their server side code and though the page reloads in their browser, their server side content will never change.

// what I currently do
const routeLocation = path.join(routesDir, `${url}.js`);

if (process.env.__GWD_COMMAND__ === 'develop') {
  delete require.cache[routeLocation];
}

const { getContent } = require(routeLocation);

if (getContent) {
  body = await getContent(); // gets a bunch of HTML from the developer
}

So when I move to ESM, I would like to be able to maintain this developer experience

// /src/routes/artists.js
import fetch from 'node-fetch';

async function getContent() {
   ...
}

export {
  getContent
}; 
// what I would like to be able to do
const routeLocation = new URL(`${url}.js`, routesDir);
const { getContent } = await import(routeLocation);

if (getContent) {
  body = await getContent();
}

I also have a use case that would benefit from having an immutable import cache exposed: I'd like to know whether a certain module was ever actually loaded or not. In CommonJS, I can check this with:

const wasLoaded = Object.keys(require.cache).includes(require.resolve("./file.js"));

In ESM, something like import.meta.cache could maybe expose the same:

const wasLoaded = Object.keys(import.meta.cache).includes(new URL("./file.js", import.meta.url).href);

Came across this when looking into how to support hot reloading in ESM. It would be ideal to do that without needing to restart the node process.

the code should be rewritten to not need to be reloaded.

Is there any way to write code that doesn't need to be reloaded when it changes? 🤔

My use case is file system routing. Every time a user edits a route, that module needs to be re-imported with the new code changes (in the dev server). I don't see any way around it.

I'm the author of a testing library called zUnit, which uses the require cache when discovering and building test suites, i.e.

// some.test.js
describe('My Test Suite', () => {
  it('should pass', () => {
  })
});

Given the above, users have the option of either building an explicit suite...

const { Suite } = require('zunit');
const someTest = require('./some.test.js');
const suite = new Suite('Example').add(someTest);

or automatically building the suite...

const { Suite } = require('zunit');
const suite = new Suite('Example').discover();

Since the tests do not explicitly export anything, after requiring them, zUnit retrieves the module from the require.cache and retrospectively adds the export from within the describe and it functions.

I'd like to update zUnit to work with both mjs and cjs modules, but it does not appear the above is possible using dynamic imports, because there is no equivalent of the require.cache, and no obvious way of obtaining a reference to a module in order to programmatically add the export

I may be able to find an alternative way to build the test suites both explicitly and automatically, but thought I'd add another use case into the mix.
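
One possible alternative, sketched here purely as an illustration and not as zUnit's actual API: instead of looking the module up in a cache after loading it, `describe` and `it` could record suites in a shared registry, and discovery could drain that registry after each dynamic import. All names below (`registered`, `currentSuite`, `discover`) are hypothetical, and test files would need `describe` and `it` made available to them, for example as globals.

```javascript
// Hypothetical registry; none of these names are part of zUnit.
let currentSuite = null;
const registered = [];

function describe(name, define) {
  const parent = currentSuite;
  const suite = { name, tests: [], suites: [] };
  // Nested suites attach to their parent, top-level ones to the registry.
  (parent ? parent.suites : registered).push(suite);
  currentSuite = suite;
  try {
    define();
  } finally {
    currentSuite = parent;
  }
}

function it(name, fn) {
  if (!currentSuite) throw new Error("it() must be called inside describe()");
  currentSuite.tests.push({ name, fn });
}

// Import each test file (CJS or ESM) and collect whatever it registered
// while being evaluated. No module cache access is needed.
async function discover(fileUrls) {
  const found = [];
  for (const fileUrl of fileUrls) {
    await import(fileUrl);
    found.push(...registered.splice(0));
  }
  return found;
}
```

Because registration happens as a side effect of evaluating the file, this works the same way for `require` and `import()`, at the cost of relying on shared mutable state.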

I thought import.meta.cache was implemented and I saw myself as a duh here. my bad. https://stackoverflow.com/questions/74215997/how-do-i-do-console-log-in-esm-or-es6-file-without-having-a-package-json-in-the?noredirect=1#comment131031974_74215997 Is there a way I can access the import cache somehow?

A hacky way for getting the import paths that someone with a bigger brain can improve on to catch any edge cases I'm missing:

import fs from "fs";

function getImportPaths(filePath: string) {
    const file = fs.readFileSync(filePath, "utf8");
    const modImports = file.split("\n").filter((line) => line.startsWith("import ") || line.startsWith("} from "));

    const paths = [];

    for (const modImport of modImports) {
        const arr = modImport.match(/["'].*["'];?$/) ?? [];
        
        if (arr.length) {
            let match = arr[0];

            if (match.endsWith(";")) {
                match = match.substring(0, match.length - 1);
            }

            match = match.substring(1, match.length - 1);

            paths.push(match);
        }
    }

    return paths;
}
simlu commented

Sorry for the long post. I wanted to document the complexity needed to work around the limitations documented in this ticket.

If we had good support for hot reloading, none of the below would be necessary


In CJS we were able to delete the cache and this didn't cause any memory leaks if done correctly.

In ESM, since the cache can no longer be deleted, we need to employ workarounds. Those workarounds always cause memory leaks, so we have to be very careful with what we re-import aka "hot reload".

Currently the test framework in question, lambda-tdd, works as follows:

  • Every test gets process.env.TEST_SEED set to a unique identifier on execution. This allows distinction between currently running tests.
  • When a new test is detected, we force invalidation of certain imports using the experimental-loader
  • Invalidation is done through the loader.resolve() by appending a querystring parameter to the returned url
  • We can't just invalidate all imports as that would be slow and cause massive memory leaks
  • Instead, to determine which imports to invalidate, we look at two things:
  1. Comment /* load-hot */ in the file. This always forces invalidation
  2. Environment variables and their values. We compute the hash of those and use that to re-import the file. This prevents unnecessary re-imports.

This approach still causes a memory leak, but it is small enough that hundreds of tests still execute successfully
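
The decision rule in the list above could be sketched roughly as follows. This is not lambda-tdd's actual code: the function name `invalidationUrl`, the `hot` query parameter, the choice of SHA-256, and hashing the whole environment (minus `TEST_SEED`) are all assumptions made for illustration. Files containing `/* load-hot */` get a URL unique to the current test seed; all other files get a URL keyed on a hash of the environment, so they are re-imported only when the environment differs.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper, meant to be called from a loader's resolve() hook
// with the resolved URL and the file's source text.
function invalidationUrl(resolvedUrl, source, env = process.env) {
  const url = new URL(resolvedUrl);

  if (source.includes("/* load-hot */")) {
    // Marked files: force a fresh import for every test.
    url.searchParams.set("hot", `seed-${env.TEST_SEED}`);
    return url.href;
  }

  // Other files: hash the environment (ignoring the per-test seed) so the
  // URL, and therefore the cached module, is reused while nothing changes.
  const entries = Object.entries(env)
    .filter(([name]) => name !== "TEST_SEED")
    .sort(([left], [right]) => (left < right ? -1 : left > right ? 1 : 0));
  const hash = createHash("sha256")
    .update(JSON.stringify(entries))
    .digest("hex");
  url.searchParams.set("hot", `env-${hash.slice(0, 16)}`);
  return url.href;
}
```

Each distinct URL still produces a module instance that is never freed, which is why keeping the number of distinct URLs small matters.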

wi-ski commented

This worked for typescript atm

function requireUncached(modulePath: string): any {
  try {
    // Resolve the module path to get the absolute path
    const resolvedPath = require.resolve(modulePath);

    // Delete the module from the cache
    delete require.cache[resolvedPath];
  } catch (e) {
    // Handle any errors that might occur
    console.error(`Failed to remove module from cache: ${e}`);
  }

  // Re-import the module
  return require(modulePath);
}

It seems there has been no activity on this issue for a while, and it is being closed in 30 days. If you believe this issue should remain open, please leave a comment.
If you need further assistance or have questions, you can also search for similar issues on Stack Overflow.
Make sure to look at the README file for the most updated links.

bump

For a while now I have been using this to force-load updated (or perhaps not) ESM modules:

const app = await import(`./path/to/file.mjs?_=${ new Date().getTime() }`)

not sure about memory leaks though, so using this only for dev, inside a Vite plugin that loads a Koa app.
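
A variation on the snippet above, as a minimal sketch: keying the query string on the file's modification time rather than the current time means an unchanged file resolves to the same URL and reuses the already-loaded module, so a new module instance is created only when the file actually changes. The helper names `cacheBustedUrl` and `importFresh` and the `mtime` query parameter are made up for illustration.

```javascript
const fs = require("node:fs");
const { pathToFileURL } = require("node:url");

// Hypothetical helper: build an import URL whose query string changes only
// when the file's modification time changes.
function cacheBustedUrl(filePath) {
  const url = pathToFileURL(filePath);
  url.searchParams.set("mtime", String(fs.statSync(filePath).mtimeMs));
  return url.href;
}

// Usage (dev only): every distinct URL still stays in the ESM cache for
// the lifetime of the process, so old versions are never freed.
async function importFresh(filePath) {
  return import(cacheBustedUrl(filePath));
}
```

This only notices changes to the file itself. Edits to modules that it imports would not change its URL, so those would need the same treatment.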

Hi! Thanks for the feature request. Unfortunately, it doesn't look like this will happen, as seen in nodejs/node#45206. If you'd like to keep this open, please let me know, but otherwise, I'll close this as not planned.

Please feel free to request a reopen if you still believe something should be done about this.

This landed in Node 22: --experimental-require-module. It allows mocha --watch to work with ESM modules, which was a broken feature because it relies on require.cache. Maybe it can help you out.

Edited:

Here is the person who found the "fix"
https://joyeecheung.github.io/blog/2024/03/18/require-esm-in-node-js/

The docs
https://nodejs.org/api/modules.html#loading-ecmascript-modules-using-require

Here is where I first saw the merge
nodejs/node#51977 (comment)

@icetbr Do you have more details and maybe some links to the relevant code / docs?

bump for @simlu. Not sure you get notifications on edits.

Here is the guy who found the “fix”

That’s not Joyee’s pronoun.

Also, while --experimental-require-module is great, I’m not sure what it has to do with the request on this thread. #2806 (comment) is still the best solution that works today.

That’s not Joyee’s pronoun.

Sorry, didn't find anything in the blog about it.

With cjs we can use the require object and its props to invalidate a module's cache so a new require loads it fresh.

Isn't this exactly what this does? You can use require to import your ESM modules, and invalidate them in the old way, with require.cache.

Edit
It's even retro compatible, you don't have to change a single line of code I think.

Not working.

// a.mjs
export const value = Date.now();
// main.cjs
setInterval(() => {
  delete require.cache[require.resolve("./a.mjs")];
  console.log(require("./a.mjs").value);
}, 1000);
$ node --experimental-require-module main.cjs
(node:17632) ExperimentalWarning: Support for loading ES Module in require() is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
1726122007831
1726122007831
1726122007831

Another sample using loaders. Now working.

// loader.js
import { register } from "node:module";

const { port1, port2 } = new MessageChannel();

export function clearImportCache(path) {
  port1.postMessage({ name: "clearImportCache", path });
}

if (!import.meta.url.includes("?loader"))
  register(`${import.meta.url}?loader`, {
    data: { port: port2 },
    transferList: [port2],
  });

const cache = {
  versions: new Map(),
  reload: new Set(),
};

export async function initialize({ port }) {
  port.on("message", (msg) => {
    if (msg.name === "clearImportCache") {
      if (msg.path) {
        cache.reload.add(msg.path);
      } else {
        // Set.prototype.add takes a single value, so add each known URL in turn.
        for (const url of cache.versions.keys()) cache.reload.add(url);
      }
    }
  });
}

export async function resolve(specifier, context, nextResolve) {
  const result = await nextResolve(specifier, context);
  if (!cache.reload.has(result.url)) return result;
  const version = (cache.versions.get(result.url) || 0) + 1;
  cache.reload.delete(result.url);
  cache.versions.set(result.url, version);
  const url = new URL(result.url);
  url.searchParams.set("version", version);
  return { ...result, url: url.href };
}
// file.js
export const value = Date.now();
// main.js
import { clearImportCache } from "./loader.js";

setInterval(async () => {
  clearImportCache(import.meta.resolve("./file.js"));
  const { value } = await import("./file.js");
  console.log({ value });
}, 1000);
$ node --import ./loader.js main.js
{ value: 1726143122772 }
{ value: 1726143123777 }
{ value: 1726143124789 }

This issue has been closed for months. If you are experiencing a new issue, please open a new issue