nodejs/node

Improve documentation on Dual Module Packages

MicahZoltu opened this issue ยท 17 comments

  • Version: v14.4.0
  • Platform: Windows 10
  • Subsystem: Windows

What steps will reproduce the bug?

git clone git@github.com:MicahZoltu/dual-module-package-repro.git
cd dual-module-package-repro/app-package
npm install
node index.mjs
# notice it works
node index.cjs
# notice it does not work

Then remove "type": "module" from library-package/package.json and repeat the process:

npm install
node index.mjs
# notice it doesn't work
node index.cjs
# notice it does work

How often does it reproduce? Is there a required condition?

Always.

What is the expected behavior?

The ability to ship an NPM package that can be used by either CJS users (any version of NodeJS) or ESM users (running NodeJS 14+)

What do you see instead?

I can either define type: module and it will work as an ESM, or I can not define it or set it to CommonJS and it will work as a CJS module, but I cannot define the package.json such that the package works in both environments.

Additional information

The error indicates that NodeJS is correctly following the exports path and locating the right module in each situation, however it proceeds to load the module using a loader picked by the presence of the type property in package.json. My expectation is that if it loads via the exports: { import: ... } entrypoint then that entire call stack from there down should be ESM, and if it loads via exports: { require: ... } then the entire call stack from there down should be CJS.

Error when type: module is not set:

dual-module-package-repro\library-package\index-esm.js:1
export const apple = 'apple'
^^^^^^

SyntaxError: Unexpected token 'export'
    at wrapSafe (internal/modules/cjs/loader.js:1116:16)

Error when type: module is set:

internal/modules/cjs/loader.js:1216
      throw new ERR_REQUIRE_ESM(filename, parentPath, packageJsonPath);
      ^

Error [ERR_REQUIRE_ESM]: Must use import to load ES Module: ...\dual-module-package-repro\library-package\index-cjs.js
require() of ES modules is not supported.
require() of ...\dual-module-package-repro\library-package\index-cjs.js from ...\dual-module-package-repro\app-package\index.cjs is an ES module file as it is a .js file whose nearest parent package.json contains "type": "module" which defines all .js files in that package scope as ES modules.
Instead rename index-cjs.js to end in .cjs, change the requiring code to use import(), or remove "type": "module" from ...\dual-module-package-repro\library-package\package.json.

    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1216:13)

The documentation makes it sound like NodeJS 14 supports packages that can be used as either ESM or CJS, but thus far I have not figured out how to actually accomplish this.

I believe this specific problem could be worked around by renaming files to .cjs and .mjs in the library-package. However, in the real world this solution is far less tenable because I'm compiling from TypeScript which does not do import rewrites during emit (and they maintain a very strong position that they have no intent to ever change this policy). This means that I cannot have TypeScript emit files where the import statements have different extensions depending on the module type being targeted. So in order to have everything be mjs in the ESM build output and cjs in the CJS build output I would need to run my code through a transformer that rewrites all import statements (both static and dynamic) and also rename all of the files to have extensions by output folder.

Note: Even if this is "not a bug", I believe the documentation could use an update to include a working example of a dual module package. Ideally, that example would be one that can be extended to be used in the real world and can work when compiling from TypeScript (i.e., an example that says to just rename all your files and all of your imports isn't particularly useful). It is also worth noting that the library should be native browser compatible as well, which means following the "MJS wrapper around CJS" pattern won't work.

For reference, the library-package/package.json contains:

{
	"name": "library-package",
	"version": "1.0.0",
	"main": "./index-cjs.js",
	"exports": {
		"import": "./index-esm.js",
		"require": "./index-cjs.js"
	},
	"type": "module"
}

Setting "type": "module" makes Node.js interpret all .js files as ESM, including index-cjs.js. When you remove it, all .js files will be interpreted as CJS, including index-esm.js.
If you want to support both with .js extension, you should create two subfolders:

$ mkdir ./cjs ./esm
$ echo '{"type":"commonjs"}' > cjs/package.json
$ echo '{"type":"module"}' > esm/package.json
$ git mv index-cjs.js cjs/index.js
$ git mv index-esm.js esm/index.js

And then have your package exports point to those subfolders:

{
	"name": "library-package",
	"version": "1.0.0",
	"main": "./cjs/index.js",
	"exports": {
		"import": "./esm/index.js",
		"require": "./cjs/index.js"
	},
	"type": "module"
}

Are there any side effects to putting a mostly-empty package.json in a subfolder like that? When I do npm publish of the root does everything else work the same, despite the nested package.json files?

Correct, npm uses only the "root" package.json file IIRC. I've used this on some of my packages, and haven't got any issue from nested package.json files.

I have updated the title of this to be a documentation improvement that illustrates the above thing with multiple output folders and package.json files.

@MicahZoltu that's cool and here's another working example (tested with node 12):
https://github.com/miraclx/xbytes/blob/d799f659791f7473263e91732dc153d3b13f705a/package.json#L6-L9
Edit: that's

"exports": {
    "require": "./dist/index.js",
    "import": "./dist/index.mjs"
  },

Note the .mjs extension - a crucial point to make it work.
I have yet to figure out how to expose both "./package.json" along with that config.

@kapouer #34515 (comment) worked for me back when I tested it after learning. All of my projects now have two tsconfig.json files (since I compile from TS) which write .js files to output-esm and output-cjs folders. Each of those folders has a package.json file with a single line in it (see linked comment).

The reason I strongly prefer this way over .mjs filenames is because I can compile from TS without needing any additional tooling to rename all of my output files. I can load files from output-esm in a browser without any additional tooling, and I can execute output-esm or load it as a library in NodeJS 14+ in an ESM project.

While the multiple-output-folder approach works, I do consider the OP to be a bug: the exports.import field will always be pointing to a file with ESM exports, so why does NodeJS not use that information as a "hint" to identify that specific file as ESM (ie. type: "module")?

The same sort of "missed opportunity" is true for a module: "./output.esm.js" field; NodeJS "ignores the hint", and treats the target file as commonjs, when it explicitly is not.

Taking these "hints" into account does "complicate" the interpretation process a bit -- but only a bit, and it seems worth it, to avoid the need for people to use separate output folders all the time. (or variant extensions, which has drawbacks for build tools like TypeScript)

the exports.import field will always be pointing to a file with ESM exports

This was discussed recently in #38740. The gist of it is the assumption is not correct: you could pass anything to the "import" field: an ES module of course, but also a WASM module, a JSON module, or even a CJS module.

Taking these "hints" into account does "complicate" the interpretation process a bit -- but only a bit, and it seems worth it

It would actually complicate it by a lot ๐Ÿ˜… also it brings other questions such as "what happens if those hints lead to a file being both ESM and CJS?", it would have to introduce all sorts of exception which could become tricky to document. See the linked discussion for more details.

That being said I agree we could/should use those hints when printing the error message to help users debug their package.json setup; someone in a recent PR is actually working on improving those error messages: #39175.

The gist of it is the assumption is not correct: you could pass anything to the "import" field: an ES module of course, but also a WASM module, a JSON module [...]

For WASM and JSON: AFAIK there are only two valid values for the package.json's type field, correct? (those are the only values that VSCode offers for auto-filling anyway) If so, then NodeJS must already have logic for properly responding to encountering those formats, regardless of whether type: "commonjs" or type: "module" is specified.

My proposal merely changes the implicit module-type, for the specific file referenced under exports.import, from commonjs to module. So I don't see how that would break the importing of WASM/JSON, given that it's not broken with the implicit type: "commonjs" that already exists.

The gist of it is the assumption is not correct: you could pass anything to the "import" field: [...] even a CJS module.

What use case is there for someone to ever put a CJS module under the import field? Why would they not instead use exports.require?

This was discussed recently in #38740.

I read through the full thread. The two main counters to the proposal seems to be:

  1. It's currently valid for module authors to create exports values like { "./foo": { "import": "./file.js" }, "./bar": { "require": "./file.js" } }, which are ambiguous given the proposal. However, I consider the above value self-contradictory (ie. NodeJS will error on one of them), so I don't see why it should even be supported.
  2. The post at the end claims that it creates confusions of resolution when absolute paths are used. I'm not an expert at node's internals, but this doesn't ring true to me; NodeJS is apparently already able to "find the package.json file" above each module, and use it to read type: "module" declarations. If it already finds the type field reliably, I don't see why it cannot find the exports field reliably. (perhaps I am misunderstanding his point)

"what happens if those hints lead to a file being both ESM and CJS?"

As mentioned above, I don't currently see any use-case for marking a file under both exports.import and exports.require, as this seems self-contradictory to me. Are there known use-cases for this, or is it theoretical only?

For WASM and JSON

I was referencing something like this:

{
  "exports": {
    "import": "./esm-data.json",
    "require": "./cjs-data.json"
  }
}

It would be wrong to assume those JSON file are either ESM or CJS IMHO. But I guess you were discussing .js files only, so that's a bit off topic, please disregard.

What use case is there for someone to ever put a CJS module under the import field? Why would they not instead use exports.require?

I don't know any valid use case, but using the "require" field would be produce very different results:

// a.cjs
module.exports = 'I was imported';

// b.cjs
module.exports = 'I was required';

// index.cjs
console.log(require('this-package')); // I was required
import('this-package').then(console.log); // { default: 'I was imported' }
{
  "name": "this-package",
  "exports": {
    "import": "./a.cjs",
    "require": "./b.cjs"
  }
}

I don't know why anyone would need that tbh, but that's the current state of things.

2. The post at the end claims that it creates confusions of resolution when absolute paths are used. I'm not an expert at node's internals, but this doesn't ring true to me; NodeJS is apparently already able to "find the package.json file" above each module, and use it to read type: "module" declarations. If it already finds the type field reliably, I don't see why it cannot find the exports field reliably. (perhaps I am misunderstanding his point)

Currently, when parsing a .js file, node will search for a package.json file, and stops and soon as it finds one. That's because the nearest package.json overrides what parent package.json may have defined, so there is no need to look further. "exports" is different.
You can define an export path in a sub-directory, so the parser would have to search every parent directory until it reaches the root of the file system, and parse very package.json it encounters, a crawl every "exports" tree it finds (which are sometimes huge). That seems less than ideal from a performance perspective. You could decide to limit it to the nearest package.json, but it would still be less performant, and I'm not convinced it would help users as much as it would create confusion for the edge cases that would arise.

Just wanted to link to a real-world use-case where NodeJS's non-utilization of the exports/jsnext:main/module field for determining module types has the result of requiring the end-developer to manually patch the package.json files of over half a dozen different packages: apollographql/apollo-client#8396 (comment)

Yes, it's possible to solve this using the approaches described above (eg. creating a dist/esm subdirectory with its own package.json file, or using the .mjs extension for esm-module outputs), but the fact that there are so many packages that provide esm versions without yet utilizing those approaches, seems like evidence that those workarounds might be seen as "non-ideal" to a lot of those library authors as well.

For me, the main reasons I don't like those two workarounds:

  • For "multiple package.json files": I don't like the increased complexity of creating multiple output folders / package.json files (as it adds unnecessary nesting, and can complicate things for some build tools).
  • For "use .mjs extension": My main annoyance here is that it complicates the setup for various tools when the extension differs from the standard .js. TypeScript is one of them. as described by the OP:

I believe this specific problem could be worked around by renaming files to .cjs and .mjs in the library-package. However, in the real world this solution is far less tenable because I'm compiling from TypeScript which does not do import rewrites during emit (and they maintain a very strong position that they have no intent to ever change this policy). This means that I cannot have TypeScript emit files where the import statements have different extensions depending on the module type being targeted. So in order to have everything be mjs in the ESM build output and cjs in the CJS build output I would need to run my code through a transformer that rewrites all import statements (both static and dynamic) and also rename all of the files to have extensions by output folder.

My guess is that several of the package maintainers have avoided use of those workarounds as well for the same (or related) reasons. Because of this, it leaves the end-user in the annoying situation of having to pester the maintainers of many different packages to change their output structure (with the drawbacks mentioned above), fork the packages themself, or manually modify their local copies using something like patch-package. That last option is what I'm going with for now; but it's less than ideal due to the need to maintain those patch files when updating the packages, as well as the fact that npm's modification of package.json files during installation necessitates workarounds for patch-package to be able to operate. (disabling the exclusion of those files, and then manual modification of the diff files that are generated to only include the actual/manual changes made)

Indeed, apparently even the Microsoft TypeScript developers assumed as I (and others) have that the module-type would be inferred from exports.import: The TypeScript compiler, with importHelpers: true, emits the following at the top of transpiled files:

import { __assign, __spreadArray } from "tslib/tslib.es6.js";

(An example of that typescript output can be seen here, from this source file and this tsconfig.json file.)

For reference, here is the package.json file for TypeScript's tslib library, which is used to resolve the import statement above:

{
    "name": "tslib",
    "author": "Microsoft Corp.",
    "homepage": "https://www.typescriptlang.org/",
    "version": "2.3.0",
    "license": "0BSD",
    "description": "Runtime library for TypeScript helper functions",
    "keywords": [...],
    "bugs": {
        "url": "https://github.com/Microsoft/TypeScript/issues"
    },
    "repository": {
        "type": "git",
        "url": "https://github.com/Microsoft/tslib.git"
    },
    "main": "tslib.js",
    "module": "tslib.es6.js",
    "jsnext:main": "tslib.es6.js",
    "typings": "tslib.d.ts",
    "sideEffects": false,
    "exports": {
        ".": {
            "module": "./tslib.es6.js",
            "import": "./modules/index.js",
            "default": "./tslib.js"
        },
        "./": "./"
    }
}

Can anyone see what the problem is?

The problem, it seems, is that the TypeScript team, like hundreds of other package maintainers, have assumed that if their package.json file has so many hints about the type of their es6 output file:

    "module": "mylib.es6.js",
    "jsnext:main": "mylib.es6.js",
    "exports": {
        ".": {
            "module": "./mylib.es6.js",
            // tslib uses the proxy-file "./modules/index.js" here, to try to set `type: module` for it
            // but it didn't actually help, since the transpiler import is for "tslib/tslib.es6.js" directly
            // and even marking that file under the "import" field here, makes no difference
            "import": "./mylib.es6.js"
        }
    }

...that NodeJS would recognize the mylib.es6.js file as esm code, and thus parse it that way.

But instead, NodeJS looks at the package.json file above (notice the four strong hints) and says, "hmm, mylib.es6.js looks like commonjs to me". And then it errors, because it doesn't find any of the exports.myExport = lines that it expected.

This is just one example out of many, of widely used Javascript packages which have misunderstood (or neglected to implement, due to inconvenience) the changes necessary for NodeJS to recognize a mylib.es6.js output file as an es6/esm module.

I've hit so many build-blockers due to this issue, due to so many modules being affected by this flaw in NodeJS (well, I consider it a flaw anyway), that I've been working on this issue for over half of a day, and still have not completed all the fixes. It seems sometimes like I'm one of very few developers who are actually attempting to use NodeJS in esm-import mode.

Many libraries have esm outputs, but like half or more of them have "broken" esm builds, with the majority of these "broken" merely because NodeJS is ignoring all of the exports.import, exports.module, jsnext:main, and module hints.

The pain continues. I keep hitting the issue over and over whenever I try to add "ESM-output" packages to my NodeJS program. (I'm now paranoid whenever I try to add a package)

Switching to ESM-import mode in NodeJS has produced hours of time-loss because of this single issue (there is no easy way to solve this as a consumer of a library -- requiring forking of libraries and/or persistently pestering the maintainers of everything I try to import). And that it is not being seen as a serious issue suggests to me that I am one of very few people actually using NodeJS in ESM-import mode.


If the inference of ESM-mode based on hints from module, jsnext:main, exports/X/module, and exports/X/import is really unable to be implemented, can there at least be some sort of fallback hatch that a consumer can use when they encounter (the hundreds of) libraries that have broken ESM-outputs, to force NodeJS to interpret the module as ESM, without having to spend hours pestering every package maintainer upstream?

Someone will surely say, "It's good that you're being forced to solve the issue upstream, so we can get it fixed for everyone."

But that doesn't sound so good of a solution when you're (seemingly) the only one using NodeJS in ESM mode, and most maintainers are not interested enough in the issue to fix it themselves. (Issues like this are why adoption of ESM-mode is moving so slowly imo; there is not a smooth path of transition to it, because of this and other issues. But this is the main one, by far, in my experience.)

Basically, until there is something in the build/publish process itself that is preventing package developers from continuing to make this mistake, the mistake will keep on happening, and ESM-output builds will keep being broken.

Until such a warning/error is implemented, consumers should be able to have some way of saying, "Yes, this module really is an ESM module -- interpret it as ESM and let me continue developing." Without that, usage of NodeJS in ESM-import mode is far more pain than it is worth.

can there at least be some sort of fallback hatch that a consumer can use when they encounter [...] libraries that have broken ESM-outputs, to force NodeJS to interpret the module as ESM [...]?

Something like this for NodeJS would be amazing: https://typestrong.org/ts-node/docs/module-type-overrides

This would (mostly) solve my pain-point, because it would allow me to just import directly from the dist/esm folder within the libraries, and force NodeJS to recognize that folder as ESM from my own project, without having to pester lots of library authors and wait for them to merge the fixes.

I think this kind of solution would create a non-negligible performance decrease if node has to check the whole file system up to the root to see if a package.json overrides the module type. It would also add a lot of complexity to the spec and the implementation, not sure it would be worth it. You can always give it a shot and open a PR proposing this change if that's something that interests you.

BTW if you don't care about transitive deps and just want your own dependencies to behave, you should look into yarn patch, or patch-package: those solution lets you override a remote package without the need of forking it โ€“ note that it would solve the issue for your dev env only, if you are writing a library that other folks depend on, the patch won't apply there (you'd need to bundle those deps, and/or keep pestering other library authors).

azu commented

I've created tsconfig-to-dual-package for making TypeScript library dual package based on @aduh95 comment: #34515 (comment)
One question, Is this behavior described in Node.js document?


The nearest parent package.json is defined as the first package.json found when searching in the current folder, that folder's parent, and so on up until a node_modules folder or the volume root is reached.
...
If the nearest parent package.json lacks a "type" field, or contains "type": "commonjs", .js files are treated as CommonJS. If the volume root is reached and no package.json is found, .js files are treated as CommonJS.
https://nodejs.org/api/packages.html#type

Edit: Ah, I missed it.
(I was checking only Dual CommonJS/ES module packages section for finding this behavior.)